Engineering Manager, Cloud Platform

94306, Palo Alto, CA, USA
Hybrid
190K-240K Annually
Senior level
Artificial Intelligence
The Role
The Engineering Manager will lead the cloud platform team to enhance reliability and scalability, oversee a team of engineers, and improve the platform's operational performance, focusing on observability, intelligence, and orchestration capabilities.
Summary Generated by Built In
GPU racks pull 120–140 kW today. By 2027, that number hits 600 kW to 1 MW per rack. The entire AI buildout — hundreds of billions in capex — is being erected on a grid that was not designed for it. Design margins have compressed from 30% to 10–15%. The monitoring systems built for the last generation of infrastructure poll at one-second intervals. GPU workloads ramp in eight milliseconds.

AI is accelerating faster than the infrastructure beneath it can be understood.
 
The incumbent vendors — Schneider, Eaton, Vertiv — were built for a world where loads were predictable and slow. They are not broken. They are mismatched to what AI infrastructure demands. Verdigris captures continuous waveforms at 8 kHz. That is not a software improvement on existing monitoring data. It is a different measurement entirely — one that makes visible what no other system can see: hidden degradation, safe operating headroom, and the real-time electrical behavior of infrastructure running at the edge of its design limits.
 
We are not a monitoring solution. We are the electrical intelligence layer — the validation layer that sits between the physical environment and the autonomous control systems the industry is building toward. Solving this matters beyond the business case. Carbon-free AI, stranded capacity recovery, and the long-term reliability of the compute layer the world is betting on all depend on getting electrical intelligence right at the physical layer.
The company
 
Twenty people. Lean by design. We have raised serious capital, refocused the company around the most consequential problem in AI infrastructure, and come out the other side with real customers, real revenue, and hardware that has been running in colocation and owned data center facilities for more than a decade. The cloud platform processes billions of 8 kHz waveform readings and turns them into validated operating limits that operators use daily.
 
Because it is built on high-fidelity 8 kHz metering, this position lets us turn the strain on electrical infrastructure into a concrete roadmap for relieving the AI industry's most critical power bottleneck and for driving the sector's next wave of technical improvement.
 
Today that means reliability and early warning. Tomorrow it means capacity optimization and machine-facing orchestration APIs that GPU schedulers consume directly.
The role
 
We are hiring an Engineering Manager to own the cloud platform — the system that makes all three product pillars work: Observability, Intelligence, and Orchestration.
 
You would manage a team of elite engineers, report to the cofounder/CTO, and hold a mandate to raise the bar on how this team builds and ships. This is a player-coach role. You will set direction, run the engineering operating cadence, and manage people. You will also read code, debug production issues, and make architectural calls. If you have not been in a codebase recently, this is not the right fit.
 
We are building a management layer that moves the team toward best-in-class standards: clear ownership, a culture of high craft, and leadership that empowers and accelerates rather than administrates. The candidate we want shares that belief and wants to move at that pace.
 
One more thing: a big part of how we operate is through deliberate, opinionated use of agentic coding tools. The team is actively moving toward an AI-native culture and learning which practices scale. You will be instrumental in defining the next standard for AI-native development here, and you will recruit and coach to that standard.
The situation
 
The platform works. Customers depend on it. The 8 kHz ingestion pipeline is real and running in production.
 
The platform is at an inflection point: the architecture and the organizational structure both need to mature to support the scale and velocity of our next-generation product roadmap. We need someone who can take charge of the platform, organize the team around clear ownership, and raise the quality bar, while also building toward future application layers that do not exist yet.

First 6 months

  • Audit the platform: reliability, scalability, observability, tech debt. Form your own view, not just ours.
  • Organize ownership across the three-pillar stack: ingestion and the 8 kHz pipeline; ML signal processing and validated operating limits; and the APIs, MCPs, and workflows that deliver them.
  • Stand up an engineering operating cadence: roadmap reviews, incident reviews, delivery planning, architecture reviews.
  • Get your hands dirty on the hardest reliability and performance problems. Ship fixes, not just plans.
  • Establish AI-native development practices on the team. Not a policy — real tooling norms, a shared view on where agentic coding accelerates, and where it creates new risk.
  • Identify hiring gaps and start filling them. Raise the bar on who we bring in.

By 12 months, here is what success looks like

  • Platform reliability and deployment velocity are measurably better. Fewer fires, faster fixes.
  • The team ships consistently with clear ownership. They do not need you in every decision.
  • There is an engineering roadmap people trust — one that connects today’s reliability work to the capacity optimization and orchestration capabilities we are building toward.
  • You have made at least two hires who made the team noticeably stronger.
  • The foundations are well architected, which lets us move up the value chain with our customers through a suite of well-thought-out applications.
  • The platform is positioned to support machine-facing orchestration APIs: the layer where validated intelligence feeds directly into GPU schedulers and demand response systems.

What we are looking for

  • Real technical depth in cloud infrastructure, data systems, or ML platforms. You can review architecture, debug production, and make tradeoffs — not just delegate them.
  • You have inherited or built a small team before and made it better. You set expectations, build ownership, and coach people up.
  • You can operate without a clean roadmap. You turn ambiguity into a plan with owners and timelines.
  • You care about production quality. Observability, incident response, release discipline. You build the habits, not just the systems.
  • You have strong opinions about how agentic coding tools change what a small team can build. You are actively shaping how your team works with AI — and you have the judgment to know where it helps and where it introduces new failure modes.
  • You are pulled by the mission. AI infrastructure is being built on a foundation that was not designed for it. Verdigris is the layer that makes it trustworthy. That framing should feel meaningful to you, not just interesting.

Why this role

  • You would work directly with the founding team and own the platform that makes the product work.
  • The company is small enough that your decisions show up in the product and the culture within months. A lean team, operating with the right practices and the right people, can build like a team ten times its size. You will define what that looks like here.
  • The 8 kHz ingestion pipeline is already running in production. You are not starting from zero. You are taking something real and making it significantly better — on infrastructure that actually matters.
  • If you are at a bigger company wondering whether you will ever get to build something from a position of real ownership, this is that role.

Skills Required

  • Experience in cloud infrastructure, data systems, or ML platforms
  • Experience managing small engineering teams
  • Ability to organize teams and set expectations
  • Familiarity with observability and incident response

The Company
HQ: Moffett Field, CA
40 Employees
Year Founded: 2012

What We Do

Verdigris is an artificial intelligence IoT platform that makes buildings smarter and more connected while reducing energy consumption and costs. By combining proprietary hardware sensors, machine learning, and software, Verdigris “learns” the energy patterns of a building. Their AI software produces comprehensive reports including energy forecasts, alerts about faulty equipment, maintenance reminders, and detailed energy usage information for each and every device and appliance. Verdigris offers a suite of applications that gives building engineers a comprehensive overview, an “itemized utility bill”, powerful reporting, and simple automation tools for their facility. For more information, visit www.verdigris.co.
