Lead ML Engineer

Reposted 11 Days Ago
2 Locations
Hybrid
220K-250K Annually
Senior level
Cloud • Software
The Role
Lead the development and implementation of AI and ML solutions, transforming prototypes into scalable products while managing a team of specialists and driving technical strategy.
Summary Generated by Built In
About the Role:

The ML problems that define the future of cloud spend

CloudZero is the cost-per-anything model for cloud and AI, for humans and the agents they deploy. We're inverting cost intelligence: from billing-first to telemetry-first. Every engineering decision is a buying decision, and we're building the platform that proves it in real time.

Telemetry-First • Cost-to-Produce • AI Inference • Agentic Governance • ML-Powered

CloudZero is inverting the traditional cost intelligence model. Instead of starting from the monthly bill, we're building toward a telemetry-first platform: lightweight collection agents inside customer environments, capturing every AI inference event, cloud resource usage, and product telemetry signal in real time. That data is reconciled against billing to produce total cost-to-produce intelligence. Not just COGS. The full picture.

AI is making every company look like a multi-tenant SaaS. Every enterprise now has per-model, per-token, per-customer AI inference complexity, and no one has a real-time answer for how to measure, govern, and optimize it. CloudZero is building that answer: a multi-tier architecture spanning real-time streaming (Kafka, Flink/KStreams), batch billing reconciliation, and an intelligent governance layer for both human engineers and the autonomous agents they deploy. Most of what makes this role extraordinary is what we're building next.

This is a founding technical engineer role. You won't be managing a team on day one; you'll be anchoring one. You'll set the technical patterns, solve the hardest data science problems in the product, and help build the team around you. The vision: CloudZero becomes the cost-per-anything model for cloud and AI, for humans and the agents they deploy.

6 hard ML problems. They sit at the intersection of financial telemetry, cloud infrastructure, AI inference, and massive scale. Some are live in product today; several are what we're building next.

  • Real-time Unit Economics: Calculate per-unit costs across millions of transactions with dynamic efficiency management

  • Predictive Cost Intelligence: Predict and prevent cost efficiency breaches before they impact business

  • Multi-Cloud Attribution: Accurately attribute cloud spend across complex systems using probabilistic modeling

  • Autonomous Optimization: Build AI agents that make safe infrastructure changes within business constraints
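
For a rough sense of the first problem: in its simplest form, unit economics means dividing attributed cost by units of activity for each key (a customer, say). The sketch below is purely illustrative and is not CloudZero's implementation; the function name `cost_per_unit` and the `(key, value)` event shape are hypothetical, and a real system would do this over streaming data with far more careful attribution.

```python
from collections import defaultdict


def cost_per_unit(cost_events, usage_events):
    """Return cost per unit of usage for each key (e.g. a customer).

    Hypothetical input shape: both arguments are iterables of
    (key, value) pairs, with dollars for cost and counts for usage.
    """
    cost = defaultdict(float)
    units = defaultdict(float)
    for key, dollars in cost_events:
        cost[key] += dollars
    for key, count in usage_events:
        units[key] += count
    # A unit cost is only defined where some usage was recorded.
    return {k: cost[k] / units[k] for k in cost if units.get(k, 0) > 0}


costs = [("acme", 10.0), ("acme", 5.0), ("globex", 4.0)]
usage = [("acme", 300), ("globex", 0)]
print(cost_per_unit(costs, usage))
```

Keys with cost but no recorded usage are left out rather than reported as zero or infinite, since a missing denominator usually points to a telemetry gap, not a free customer.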

Responsibilities:
  • Lead by example: spend 60-70% of your time building, architecting, and solving technical problems

  • Prototype novel ML/AI research ideas, and help translate them into production-ready systems that handle enterprise scale

  • Build AI-powered features (in partnership with product/engineering teams) for cost optimization, anomaly detection, and predictive analytics

  • Establish technical standards and development processes for AI/ML systems

  • Build and develop a small team of AI/ML specialists

  • Provide hands-on coaching and technical guidance to team members

  • Foster a culture of innovation, continuous learning, and customer focus

  • Lead by example in technical decision-making and problem-solving approach

  • Partner closely with engineering teams to embed AI throughout the platform

  • Translate complex AI concepts into business value for executives and customers

  • Drive AI strategy alignment with company vision and product roadmap

  • Represent CloudZero's AI capabilities in customer conversations and industry events

Qualifications:
  • 6+ years in ML engineering and/or data science disciplines, with meaningful time in production systems at scale

  • Deep time-series fluency — you've built forecasting and anomaly detection systems that made it to production and earned customer trust

  • Classical ML foundations — graphs, clustering, probabilistic modeling, data structures. You reach for the right tool, not the trendiest one

  • Production ML engineering — you've owned the full stack: feature engineering, model serving, monitoring, retraining pipelines, feedback loops

  • Python fluency and data warehouse experience (Snowflake, BigQuery, or equivalent)

  • Formal background in Computer Science, Statistics, Mathematics, or a related quantitative field

  • GenAI/LLM experience: you've integrated LLMs, seen their failure modes, and know when to use them vs. traditional ML

  • Cloud ML infrastructure — AWS SageMaker, Bedrock, or equivalent. Building systems at enterprise scale in AWS/GCP

  • FinOps or cost intelligence domain knowledge (nice to have): understanding of cloud billing, infrastructure cost models, or related financial data

  • Founding IC experience — you've been the first or second data scientist and know what it takes to build from scratch

  • Graph modeling and semantic layers — knowledge graphs, entity resolution, or semantic modeling in production contexts

  • Bias toward correctness — you care whether models are actually right, not just accurate on a validation set

About CloudZero:

Cloud cost management is one of the biggest challenges organizations face today. As cloud adoption continues to accelerate, so do the complexities and costs associated with it — and macroeconomic conditions only increase pressure to prove cloud efficiency.

CloudZero is a SaaS platform at the intersection of next-generation cloud cost management and FinOps. We ingest billing and usage data from all cloud, SaaS, and PaaS providers, organize it in real time according to our customers' business structures, and empower organizations to make more informed business decisions.

Since our founding in 2016, our mission has been to make efficient innovation a reality for every cloud-driven organization. We believe every engineering decision is a buying decision, and we're applying proven reliability engineering principles to financial efficiency.

We believe the best AI empowers users with clear insights and confident decisions, transforming complex cloud cost data into actionable intelligence that drives meaningful business outcomes.

To date, we've raised over $56 million from leading venture capital firms. We're solving problems of massive scale, business importance, and complexity in a space that needs it more than ever.

Equal Opportunity Employer

CloudZero is an equal opportunity employer and values diversity. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status or disability status. All job offers are contingent upon the candidate passing background and reference checks.

Please note: CloudZero is unable to sponsor employment visas. Candidates must have permanent authorization to work in the United States without the need for current or future sponsorship.

Top Skills

Bedrock
Python
Sagemaker

The Company
HQ: Boston, MA
180 Employees
Year Founded: 2016

What We Do

CloudZero is the only cloud cost intelligence platform that puts engineering in control by connecting technical decisions to business results.

CloudZero ingests cost data from AWS and Snowflake, organizes it for analysis, and delivers the insights to engineering teams who can understand how their work is impacting the business.

You can answer questions like:

* Who are my most expensive customers?
* Which products, features, and teams are spending the most?
* Has the profitability of my product changed quarter over quarter?

The outcome is real-time intelligence that helps companies control their cost of goods sold (COGS) and gross margins — aligning engineering and finance teams once and for all.

