Software Engineer, Build Systems / CI

Posted 3 Days Ago
3 Locations
In-Office
185K-490K Annually
Senior level
Artificial Intelligence • Machine Learning • Generative AI
The Role
Design and operate CI infrastructure to improve developer productivity and reliability in build systems, enhancing tool efficacy and performance across teams.
Summary Generated by Built In
About the Role

The Engineering Acceleration team builds and operates the foundational systems that engineers use to build, test, and ship ChatGPT, the API, and OpenAI's infrastructure.

We are looking for an engineer to help evolve OpenAI's build and continuous integration systems for a fast-growing engineering organization. This role sits at the intersection of developer productivity, build systems, distributed infrastructure, and software quality. You will work on the systems that determine how quickly and confidently engineers can move: Bazel-based builds, Buildkite pipelines, test selection, remote caching and execution, CI observability, and tooling that helps engineers understand and fix failures quickly.

Our mission is to make OpenAI one of the most productive engineering organizations in the world while preserving a high bar for correctness, reliability, and safety. The best version of this work is invisible when it succeeds: builds are fast, tests are trusted, CI failures are understandable, and engineers can focus on shipping useful systems instead of fighting infrastructure.

In This Role, You Will
  • Own and evolve Bazel-based build and test workflows across a large, polyglot monorepo.

  • Design and maintain Starlark rules, macros, toolchains, and integrations that make builds reproducible, hermetic, and easy for product teams to adopt.

  • Improve CI performance and reliability across Buildkite pipelines, including queue time, build time, cache hit rates, test sharding, retry behavior, and flake isolation.

  • Build systems that reduce unnecessary CI work through affected-target detection, dependency graph analysis, test selection, caching, batching, and smarter scheduling.

  • Improve local development workflows so engineers can reproduce CI behavior, debug build failures, and iterate quickly without learning every detail of the build stack.

  • Operate and optimize build infrastructure across Docker/OCI images, Kubernetes-based runners, cloud resources, and remote cache/execution systems.

  • Instrument build and CI systems with metrics, logs, traces, dashboards, and analytics so we can measure speed, reliability, cost, and developer impact.

  • Partner directly with product, infrastructure, and research engineering teams to understand pain points, onboard projects, debug hard build issues, and remove systemic bottlenecks.

  • Use modern AI tools to rethink CI failure analysis, flaky test debugging, PR triage, automatic remediation, and developer-facing explanations.

  • Own the reliability of the systems you build, including participating in an on-call rotation for critical developer infrastructure.

Technologies Commonly Used In This Environment Include
  • Bazel and Starlark for build and test workflows

  • Buildkite for CI orchestration

  • Docker and OCI images for build and runtime packaging

  • Kubernetes for CI runners and infrastructure orchestration

  • Python, Go, TypeScript, Rust, C++, and other languages in a large monorepo

  • Terraform for infrastructure as code

  • Remote caching, remote execution, artifact storage, and build telemetry systems

  • Postgres, Kafka, and internal services used to power engineering platforms

You May Be A Strong Fit If You
  • Have 5+ years of software engineering experience, including significant experience building infrastructure or tooling for developers.

  • Have hands-on experience with Bazel, Buck, Pants, Gradle, or similar build systems, and understand the tradeoffs of hermetic builds, dependency graphs, caching, sandboxing, and remote execution.

  • Have built or operated CI systems at scale, especially in environments where build time, queue time, test flakiness, and developer trust materially affect engineering velocity.

  • Are comfortable writing production software for internal platforms, not just configuring tools. We expect this role to involve code, design, debugging, operations, and long-term ownership.

  • Can debug distributed build and CI failures across source control, dependency management, containers, runners, remote caches, test frameworks, and service infrastructure.

  • Care deeply about developer experience and have empathy for the small sources of friction that slow teams down or create operational toil.

  • Are pragmatic about platform adoption: you know how to build paved paths that teams want to use because they are faster, clearer, and more reliable.

  • Communicate clearly across teams and can turn ambiguous productivity problems into concrete technical plans.

  • Are excited to apply AI to developer infrastructure in ways that make engineers faster without weakening quality, reliability, or safety.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. 

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Skills Required

  • 5+ years of software engineering experience
  • Experience building infrastructure or tooling for developers
  • Hands-on experience with Bazel or similar build systems
  • Experience operating CI systems at scale
  • Comfortable debugging distributed systems
  • Ability to work across teams and communicate clearly
  • Excited to apply AI to developer infrastructure

OpenAI Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes drawn from responses that popular LLMs generated to common candidate questions about OpenAI. It has not been reviewed or approved by OpenAI.

  • Equity Value & Accessibility: Equity is considered substantial and has become more accessible through tender offers and eased vesting terms. Feedback suggests this provides meaningful upside beyond base pay for many technical roles.
  • Parental & Family Support: Parental leave spans 20–24 weeks with post-leave flexibility, alongside generous fertility coverage and family planning support. Feedback suggests these programs materially support caregivers and families.
  • Healthcare Strength: Health, dental, and vision insurance are comprehensive and include mental healthcare support. Feedback suggests overall medical coverage is strong and part of a broader wellbeing focus.

The Company
HQ: San Francisco, CA
224 Employees
Year Founded: 2015

What We Do

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. AI is an extremely powerful tool that must be created with safety and human needs at its core. OpenAI is dedicated to putting that alignment of interests first — ahead of profit. To achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity. Our investment in diversity, equity, and inclusion is ongoing, executed through a wide range of initiatives, and championed and supported by leadership. At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
