Software Engineer, Infrastructure Platform

Reposted 2 Days Ago
4 Locations
In-Office
200K-250K Annually
Mid level
Artificial Intelligence • Software
The Role
The role involves developing internal tooling for infrastructure management, building systems for asset lifecycle, automation, monitoring, and collaborating cross-functionally to improve operations.
Summary Generated by Built In
About Fluidstack

At Fluidstack, we’re building the infrastructure for abundant intelligence. We partner with top AI labs, governments, and enterprises (including Mistral, Poolside, Black Forest Labs, Meta, and more) to unlock compute at the speed of light.

We’re working with urgency to make AGI a reality. As such, our team is highly motivated and committed to delivering world-class infrastructure. We treat our customers’ outcomes as our own, taking pride in the systems we build and the trust we earn. If you’re motivated by purpose, obsessed with excellence, and ready to work very hard to accelerate the future of intelligence, join us in building what's next.

About the Role

Fluidstack, a leading cloud provider, is looking for a Software Engineer, Infrastructure Platform to build the foundational platforms that enable our global infrastructure and data center operations. You'll develop comprehensive internal tooling across multiple domains (CMDB, asset management, DCIM, monitoring and observability, security, and operational automation) that streamlines how we deploy, manage, and operate infrastructure at scale. Working cross-functionally with engineering, operations, data center teams, and product, you'll deliver scalable, reliable, user-friendly solutions that directly impact our ability to grow and deliver world-class infrastructure services.

Focus

Infrastructure Platform Development

  • Design and build our next-generation CMDB system as the authoritative source of truth for infrastructure assets, network topology, and configuration data

  • Create DCIM platforms for rack operations, server/GPU deployment, OS installation, quality assurance, and white-screen operations

  • Develop end-to-end asset lifecycle management systems covering receiving, racking, inventory, break-fix, and decommissioning workflows

  • Build monitoring and observability platforms integrating telemetry from BMS, EPMS, and IT devices with intelligent alarming and incident management

  • Create self-service portals and automation for new region bootstrap, day-2 operations, and fleet-scale management

Operational Excellence & Automation

  • Eliminate manual toil through workflow automation and self-service tooling that empower operations and engineering teams

  • Build workflow orchestration systems for complex multi-step processes spanning incident, problem, and change management

  • Develop digital twin visualizations and operational dashboards surfacing actionable insights; partner with data teams on analytics

  • Create integration layers connecting internal platforms with external vendors and third-party systems

Cross-Functional Partnership

  • Collaborate with data center operations, system engineering, network engineering, and security teams to understand requirements and deliver high-impact solutions

  • Work with product and business stakeholders to prioritize features, define roadmaps, and balance competing needs

  • Align with support and operations teams to ensure platforms scale with organizational growth

Technical Leadership

  • Evaluate build vs. buy decisions for platform components, weighing in-house development against commercial SaaS and open-source solutions for scalability, cost, and flexibility

  • Champion modern development practices including CI/CD, infrastructure-as-code, automated testing, and observability-first design

  • Participate in architecture reviews and design discussions, contributing to technical direction and standards

  • Foster technical excellence through code reviews, documentation, and knowledge sharing

Scalability & Reliability

  • Design high-performance, fault-tolerant systems capable of handling thousands of QPS as our infrastructure footprint expands

  • Build comprehensive monitoring, logging, and debugging capabilities with robust error handling

  • Implement data migration strategies and manage upstream/downstream dependencies carefully during platform evolution

  • Own projects end-to-end from concept through deployment, ensuring production readiness and operational excellence

About You
  • 3+ years of professional software development experience building production systems

  • Strong programming skills in Python, Go, or similar languages with understanding of system design patterns

  • Experience designing and implementing RESTful APIs, data models, and distributed systems

  • Proficiency with relational and NoSQL databases (PostgreSQL, Redis, etc.)

  • Hands-on experience with containerization (Docker) and infrastructure-as-code tools (Terraform, Ansible)

  • Understanding of CI/CD pipelines and modern development workflows

  • Solid grasp of networking fundamentals (TCP/IP, DNS, HTTP) and Linux/Unix environments

  • Strong problem-solving abilities with attention to scalability, reliability, and operational concerns

  • Excellent communication skills—able to convey technical concepts to both technical and non-technical stakeholders

  • Experience with CMDB systems (NetBox, Device42) or asset management platforms

  • Background in infrastructure automation, DevOps, or platform engineering

  • Familiarity with workflow orchestration frameworks (Temporal, Airflow, Camunda)

  • Knowledge of monitoring and observability stacks (Prometheus, Grafana, OpenTelemetry)

  • Experience with time-series databases and data visualization

  • Understanding of ITSM frameworks (ITIL) and service management practices

  • Experience in data center operations, facilities management, or physical infrastructure

  • Contributions to open-source infrastructure projects

  • Bachelor's degree in Computer Science or equivalent practical experience

Salary & Benefits
  • Competitive total compensation package (salary + equity).

  • Retirement or pension plan, in line with local norms.

  • Health, dental, and vision insurance.

  • Generous PTO policy, in line with local norms.

The base salary range for this position is $200,000 - $250,000 per year, depending on experience, skills, qualifications, and location. This range represents our good faith estimate of the compensation for this role at the time of posting. Total compensation may also include equity in the form of stock options.

We are committed to pay equity and transparency.

Fluidstack is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Fluidstack will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

You will receive a confirmation email once your application has been successfully accepted. If there is an error with your submission and you do not receive a confirmation email, please email [email protected] with your resume/CV, the role you've applied for, and the date you submitted your application; someone from our recruiting team will be in touch.

Top Skills

Ansible
CI/CD
Docker
Go
Grafana
OpenTelemetry
Postgres
Prometheus
Python
Redis
RESTful APIs
Terraform
Time-Series Databases

The Company
HQ: London
30 Employees
Year Founded: 2017

What We Do

Instantly reserve dedicated clusters of NVIDIA H200s and GB200s for any scale to supercharge your training and inference workflows.
