CloudZero is growing fast. Our customer base is expanding, the data challenges we're solving are getting more complex, and the platform is scaling to match. As a CloudOps Engineer you'll be a force multiplier for our engineering organization, owning the performance, reliability, and observability of CloudZero's infrastructure and empowering teams to ship features that help customers understand and optimize their cloud spend.
This is real infrastructure work at real scale, not a ticket-closing role or a console-clicking job. CloudZero processes billions of events daily across AWS, Azure, and GCP. Our customers rely on real-time, accurate cost data to make business-critical decisions, and any instability in our system impacts their planning. Built entirely on a unique serverless architecture with no EC2s or containers, our platform demands infrastructure that scales gracefully, fails predictably, and recovers automatically.
If you thrive on hard operational problems, care deeply about reliability and performance, and want to see your work matter to customers in direct and measurable ways, this role was built for you.
Infrastructure as Code
Design and maintain Pulumi modules that provision reliable, cost-efficient cloud resources
Own infrastructure end to end with no clicking through consoles
Observability
Instrument systems so that failures surface quickly and debugging happens with data, not guesswork
Build observability into everything so you know about problems before customers do
Automation
Automate deployments, scaling, backups, and limit changes; if humans are doing it repeatedly, build a system to do it instead
Balance automation intelligently, building solutions to real problems rather than automating for its own sake
Partner with Product Engineering
Help teams design resilient services, review architectures for operational complexity, and build deployment pipelines that enable safe and fast shipping
Optimize for cost and performance; CloudZero's business is helping others optimize cloud costs, and we should be exemplars of efficient cloud usage ourselves
3 to 5+ years of experience building and operating distributed systems in AWS
Strong skills in Python and Infrastructure as Code using Pulumi or Terraform
Experience with frontier AI models such as Claude, Codex, or Gemini
Hands-on experience with monitoring tools such as Prometheus or Datadog
Proven ability to debug production issues under pressure
Values thoughtful, reliable system design over reactive hero efforts
Strong documentation habits to support long-term team clarity and system stability
Ability to clearly explain complex technical issues to non-technical stakeholders
Excited to take ownership of infrastructure and solve operational challenges at scale
Cloud cost management is one of the biggest challenges organizations face today. As cloud adoption continues to accelerate, so do the complexities and costs associated with it, and macroeconomic conditions only increase pressure to prove cloud efficiency.
CloudZero is a SaaS platform at the intersection of next-generation cloud cost management and FinOps. We ingest billing and usage data from all cloud, SaaS, and PaaS providers, organize it in real time according to our customers' business structures, and empower organizations to make more informed business decisions.
Since our founding in 2016, our mission has been to make efficient innovation a reality for every cloud-driven organization. We believe every engineering decision is a buying decision, and we're applying proven reliability engineering principles to financial efficiency.
We believe the best AI empowers users with clear insights and confident decisions, transforming complex cloud cost data into actionable intelligence that drives meaningful business outcomes.
To date, we've raised over $56 million from leading venture capital firms. We're solving problems of massive scale, business importance, and complexity in a space that needs it more than ever.
Equal Opportunity Employer
CloudZero is an equal opportunity employer and values diversity. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status or disability status. All job offers are contingent upon the candidate passing background and reference checks.
CloudZero Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about CloudZero and has not been reviewed or approved by CloudZero.
- Healthcare Strength: Healthcare coverage is described as comprehensive, spanning medical, dental, and vision. This breadth is consistently presented as a core part of the total rewards package.
- Leave & Time Off Breadth: Paid time off is presented as flexible and generous, with practices like Focus Fridays supporting balance. Remote-first policies and periodic meetups complement the time-off approach.
- Equity Value & Accessibility: Equity grants are included broadly, giving employees a stake in the company’s success. This equity component is positioned as a meaningful part of total compensation.
CloudZero Insights
What We Do
CloudZero is the only cloud cost intelligence platform that puts engineering in control by connecting technical decisions to business results. CloudZero ingests cost data from AWS and Snowflake, organizes it for analysis, and delivers the insights to engineering teams so they can understand how their work is impacting the business. You can answer questions like:
- Who are my most expensive customers?
- Which products, features, and teams are spending the most?
- Has the profitability of my product changed quarter over quarter?
The outcome is real-time intelligence that helps companies control their cost of goods sold (COGS) and gross margins, aligning engineering and finance teams once and for all.








