Software Engineer, Hermetic Build

San Francisco, CA, USA
In-Office
175K-250K Annually
Senior level
Artificial Intelligence • Cloud • Information Technology • Infrastructure as a Service (IaaS)
The Role
You will lead the build system and CI/CD processes at SF Compute, focusing on improving reproducibility, hermeticity, and speed. The role includes auditing current systems, migrating to a new build system, ensuring effective CI, and collaborating with teams on new infrastructure developments.
Summary Generated by Built In

We're building the company which will de-risk the largest infrastructure build-out in history.

When people finance GPU clusters, the datacenters housing them, and the infrastructure powering them, they need "offtake" - meaning someone has signed a contract to lease the cluster for a period of time before it's even built.

Financing a GPU cluster is inherently risky, since margins are thin and volumes are huge. Lenders don't want to take on the risk that cluster developers can't repay their loan, and cluster developers really don't want to risk not selling their cluster. As a result, risk is offloaded to the customer using fixed-price long-term contracts.

If you don't mitigate this customer risk, there's a bubble. This isn't SaaS anymore - application layer companies sign multi-year contracts for compute and inference, but sell to customers on monthly subscriptions. If you mess up a purchase, it's game over: a minor shift in your revenue growth rate might mean the difference between profit and bankruptcy. But what if companies could exit their contract by selling it back to the market?

Otherwise, as AI scales, compute only becomes available to folks who can effectively take on that risk. A 2-person startup in a San Francisco Victorian can't realistically sign a 5-year take-or-pay contract on a $100M supercomputer. But they may be able to buy the month of liquidity that someone else sold back.

So that's what we make: a liquid market for GPU offtake.

About SF Compute

The San Francisco Compute Company runs large-scale GPU clusters (H100s, H200s, B300s) on contracts you can exit. Need 256 H100s for three days? Buy them at market price, cancel what you don't use. We operate the stack from UEFI up, so you're never paying a reseller markup or waiting on a support ticket. Customers include NVIDIA, MIT, Liquid AI, and Roboflow. We're a small team that has managed over $1B of hardware and is building what we think will be the defining infrastructure marketplace for the AI era.

The Role

We need someone who has run a serious build system at a previous job, ideally a large Bazel monorepo, and wants to do it again here. Our codebase is a TypeScript monorepo, a Rust workspace, a protobuf layer that wires them together, and a growing pile of services and container images. CI works. It isn't hermetic, it isn't deterministic, and the cache hit rates are nowhere near where they should be. That's the work.

You'll own the build and CI experience top to bottom. We're not religious about Bazel. If Buck2 fits better, or a simpler setup gets us 80% of the value, that's fine. The goal is local and CI builds that produce the same artifact, fast incremental feedback for every engineer, and a credible roadmap for what this looks like at 10x our current size.

What You'll Do
  • Audit the current build and test pipeline (Bun for TypeScript, Cargo for Rust, buf for protobuf, plus Docker and Helm) and write down where it fails on reproducibility, hermeticity, and speed

  • Pick a build system and migrate us onto it without breaking shipping

  • Stand up remote execution and remote caching that actually move CI and local build times

  • Pin toolchains, seal dependencies, and stop the host environment from leaking into builds

  • Run the long-term roadmap for build, test, and CI as the team and codebase grow

  • Work alongside application and infrastructure engineers throughout, since the migration touches all of them

What We're Looking For
  • Senior or staff-level experience running Bazel, Buck2, Pants, or a comparable system somewhere the build system genuinely mattered

  • Experience operating remote execution and remote caching in production

  • Comfortable across language ecosystems. We run TypeScript and Rust today, with Python showing up.

  • Strong opinions on determinism and reproducibility, with the judgment to know when full hermeticity is worth the cost and when it isn't

  • CI ops chops: queue health, flake budgets, real test signal, build time budgets you can defend

  • Able to scope your own work. There's no spec for what our build system should look like.

  • Nice to have: experience moving a codebase onto Bazel (or off of it), polyglot or protobuf-heavy monorepos, prior work on developer infrastructure at an autonomy, robotics, or systems company

Why This Role

Build systems are one of the few pieces of infrastructure where every hour you save shows up for every engineer in the company. Doing this well before we're 10x the size is one of the most leveraged things we can do right now. You pick the tools, you set the standards, and you own the outcome.

Benefits

Generous equity grant

Team members are offered a competitive salary along with equity in the company

Visa Sponsorships

Yes, we sponsor visas and work permits

Retirement matching

We match 401(k) plans up to 4%

Medical, dental & vision

We offer competitive medical, dental, and vision insurance for employees and dependents and cover 100% of premiums

Time off

We offer unlimited paid time off as well as 10+ observed holidays

Parental leave

We offer biological, adoptive, and foster parents paid time off to spend quality time with family

Daily lunch

We cover lunch daily for employees

Unlimited office book budget

You can buy as many books for the office as you want

The San Francisco Compute Company is committed to maintaining a workplace free from discrimination and harassment.

We make employment decisions based on business needs, job requirements, and individual qualifications, without regard to race, color, religion, belief, national origin, social or ethnic origin, age, physical, mental, or sensory disability, sexual orientation, gender identity or expression, marital status, civil union or domestic partnership status, past or present military service, HIV status, family medical history or genetic information, family or parental status including pregnancy, or any other status protected by law.

We welcome the opportunity to consider qualified applicants with prior arrest or conviction records. Our commitment to diversity includes hiring talented individuals regardless of their criminal history, in accordance with local, state, and federal laws, including San Francisco’s Fair Chance Ordinance and California’s ban-the-box laws.

The Company
30 Employees
Year Founded: 2023

What We Do

San Francisco Compute Company operates a marketplace for large-scale GPU clusters, enabling users to buy and sell compute contracts with flexible terms. They aim to make AI compute more accessible and affordable by creating a liquid market for GPU offtake.
