Capacity Systems Software Engineer

Posted 10 Days Ago
San Francisco, CA, USA
In-Office
293K-455K Annually
Senior level
Artificial Intelligence • Machine Learning • Generative AI
The Role
Design and develop software systems for planning and optimizing compute infrastructure; partner with capacity planning, fleet operations, finance, and engineering teams to replace spreadsheet-driven workflows and improve operational decision-making.
Summary Generated by Built In

About the Team

OpenAI's Industrial Compute organization is responsible for planning, delivering, operating, and optimizing the compute infrastructure that powers frontier AI.

As OpenAI scales toward becoming an intelligence utility, Industrial Compute coordinates a complex lifecycle spanning infrastructure strategy, capacity planning, provider partnerships, fleet operations, product demand, and financial planning. The organization manages one of the largest and fastest-growing compute footprints in the world, where decisions around capacity allocation, deployment readiness, utilization, reliability, and product demand directly impact product availability, customer experience, and business performance.

The Capacity Systems team builds the software platforms, data systems, and automation frameworks that connect these functions into a shared operating model. We transform fragmented planning workflows into scalable systems that enable teams to understand what compute was contracted, delivered, healthy, allocated, and ultimately converted into business and research outcomes.

About the Role

We are seeking a Capacity Systems Software Engineer to build the platforms and services that power Industrial Compute planning, forecasting, optimization, and operational decision-making.

In this role, you will design and develop software systems that connect infrastructure delivery, fleet health, capacity allocation, demand forecasting, deployment readiness, financial planning, and product consumption into a unified system of record. Your work will help OpenAI make better decisions about where compute should be deployed, how capacity should be allocated, and how infrastructure investments translate into business value.

You will partner closely with Capacity Planning, Fleet Operations, Infrastructure Engineering, Product, Finance, Supply Chain, and Strategic Sourcing teams to replace spreadsheet-driven workflows with scalable software systems that enable visibility, automation, and decision support across OpenAI's global compute footprint.

This role is ideal for engineers who enjoy building internal platforms, operational systems, workflow automation, and data-intensive applications that sit at the intersection of software, infrastructure, and business operations.

This role is based in San Francisco and follows OpenAI's hybrid work model of 3 days per week in the office.

Why This Role Matters

Demand for compute is growing faster than traditional planning systems can support.

Capacity decisions increasingly require understanding complex tradeoffs across infrastructure delivery, fleet health, model deployment, latency, utilization, cost, revenue impact, and operational risk. The systems built by this team help transform compute from a collection of disconnected operational signals into a measurable, optimizable business capability.

Success in this role will help create a world where every major unit of compute can be traced from infrastructure commitment through delivered business value, enabling faster launches, higher utilization, improved operational efficiency, and better decision-making across OpenAI.

Key Responsibilities

  • Design and build software systems that serve as the system of record for Industrial Compute planning and operations.

  • Develop backend services, APIs, workflows, and data platforms that support capacity forecasting, allocation, deployment readiness, and operational planning.

  • Build applications that connect infrastructure delivery, fleet health, capacity utilization, product demand, and financial planning into a shared operational view.

  • Build planning and scenario-modeling systems that help leaders understand tradeoffs across capacity, utilization, cost, reliability, launch timing, and business impact.

  • Create workflow automation and decision-support tooling that improves planning accuracy and reduces operational overhead.

  • Partner with Capacity Planning, Fleet Operations, Product, Finance, Infrastructure Engineering, Supply Chain, and Strategic Sourcing teams to understand operational workflows and translate them into software systems.

  • Drive architecture decisions across planning platforms, operational tooling, and internal infrastructure systems.

  • Improve data quality, observability, and operational visibility across Industrial Compute programs.

  • Build extensible software foundations that scale alongside OpenAI's rapidly growing infrastructure footprint.

Qualifications

  • 5+ years of experience in software engineering, platform engineering, infrastructure engineering, or related technical disciplines.

  • Strong programming experience in Python, Go, Java, TypeScript, or similar languages.

  • Experience building distributed systems, backend services, internal platforms, workflow systems, or operational tooling.

  • Experience designing APIs, data pipelines, and integrations across multiple systems.

  • Strong system design and software architecture skills.

  • Experience working with large operational datasets and business-critical workflows.

  • Ability to operate effectively in highly cross-functional environments and translate ambiguous operational challenges into scalable technical solutions.

  • Strong ownership mindset and ability to independently drive complex projects.

Preferred Skills

  • Experience building planning systems, forecasting platforms, optimization engines, or decision-support tools.

  • Experience with SQL, data warehouses, orchestration frameworks, analytics platforms, and distributed data systems.

  • Experience supporting infrastructure, cloud platforms, data centers, hardware deployment programs, or large-scale operational environments.

  • Familiarity with capacity planning, supply chain systems, financial modeling, or infrastructure operations.

  • Experience replacing spreadsheet-driven workflows with scalable software platforms.

  • Experience building systems that support scenario planning, forecasting, optimization, or resource allocation.

  • Familiarity with AI infrastructure, hyperscale compute environments, or large-scale distributed systems.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. 

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

OpenAI Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about OpenAI and has not been reviewed or approved by OpenAI.

  • Equity Value & Accessibility: Equity is considered substantial and has become more accessible through tender offers and eased vesting terms. Feedback suggests this provides meaningful upside beyond base pay for many technical roles.
  • Parental & Family Support: Parental leave spans 20–24 weeks with post-leave flexibility, alongside generous fertility coverage and family planning support. Feedback suggests these programs materially support caregivers and families.
  • Healthcare Strength: Health, dental, and vision insurance are comprehensive and include mental healthcare support. Feedback suggests overall medical coverage is strong and part of a broader wellbeing focus.

The Company
HQ: San Francisco, CA
4,500 Employees
Year Founded: 2015

What We Do

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. AI is an extremely powerful tool that must be created with safety and human needs at its core. OpenAI is dedicated to putting that alignment of interests first — ahead of profit. To achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. Our investment in diversity, equity, and inclusion is ongoing, executed through a wide range of initiatives, and championed and supported by leadership. At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
