SRE/Infrastructure Engineer

San Francisco, CA, USA
In-Office
200K-350K Annually
Senior level
Artificial Intelligence
Code interpreting for your AI apps | Safely run AI-generated code in E2B sandbox
The Role
The SRE/Infrastructure Engineer will manage Terraform and Kubernetes across cloud platforms, ensuring scalable infrastructure. Responsibilities include multi-cloud deployments, observability, and creating reusable components.
Summary Generated by Built In
About E2B

E2B is a fast-growing Series A startup with 8-figure revenue. We've raised over $37M since our founding in 2023. Our customers include companies like Microsoft, Perplexity, Hugging Face, Manus, and Groq. We're building the next hyperscaler for AI agents.

About the role

You will own the Terraform, Kubernetes, and cloud plumbing that lets E2B run millions of sandboxes.

Today our infrastructure runs on Nomad and Terraform across Google Cloud, with multi-cloud expansion in flight. You'll help migrate us to Kubernetes, build reusable Terraform components, and harden our self-hosted and BYOC deployments.
This role owns BYOC deployments end-to-end, which means you'll work directly with our enterprise customers. You'll join technical pre-sales conversations, run the deployments, and be the point of contact through rollout and operations. You're comfortable in front of a customer, not just in a terminal.

Your job will be:

  • Making our BYOC offering ready for AWS, Google Cloud, and Azure marketplaces

  • Owning and extending our Terraform footprint across Google Cloud, AWS, and Azure

  • Building reusable Terraform components (networking, IAM, secrets)

  • Wiring up observability and tightening the loop between infra change and production behavior

  • Making BYOC and self-hosted deployments fast and repeatable for our largest customers

  • Working directly with enterprise customers through deployment and operations - pre-sales, rollout, and ongoing support

We're looking for an infrastructure engineer who actually wants to live in Terraform and Kubernetes every day, and who reaches for modern AI tooling to move faster.
A note on scope: you own how BYOC is deployed, operated, and standardized. Product and custom code changes are owned by our Platform team. You'll partner with them rather than fork the product per customer. Deep system-internal issues escalate into Platform/Core. This keeps your focus on making deployments repeatable, not maintaining snowflakes.

If words like Terraform modules, Kubernetes operators, GCP VPCs, IAM bindings, Cloudflare workers, and BYOC deployments sound like a good Tuesday, we want to hear from you.

What we're looking for

  • 5+ years operating production cloud infrastructure - You've owned Terraform and at least one orchestrator (Kubernetes, Nomad, ECS) in production. You've built modules, managed state, debugged drift, and managed production incidents.

  • Hands-on Kubernetes at meaningful scale - You've operated Kubernetes clusters past the tutorial stage: real workloads, real traffic, real on-call. You can speak to ingress, RBAC, and autoscaling.

  • Strong Linux fundamentals - We're a low-level infrastructure product. You're comfortable deep in Linux: networking, namespaces and cgroups, systemd, filesystems, and debugging at the OS level. When a deployment breaks, the root cause is often below the orchestrator, and you can follow it there.

  • Comfortable working directly with customers - A chunk of BYOC is customer-facing: pre-sales technical discussions, deployment, and operational support for enterprise customers. You can hold a technical conversation with a customer and own the relationship through a rollout, not just hand off a config.

  • Multi-cloud comfort, with one cloud at expert depth - You've built and operated on Google Cloud, AWS, or Azure: configured VPCs, managed IAM, set up private networking, and debugged routing and DNS. We're primarily on GCP today and expanding; bring depth in at least one and be willing to ramp on the others.

  • Terraform as a first-class skill - You read other people's modules without flinching, write your own when needed, and have opinions about state, workspaces, and module boundaries.

  • Comfortable reading and writing code when the work crosses out of YAML - You won't be writing product Go. But you should read Go and Terraform fluently, contribute small fixes, and not be lost when a debug session moves from config into source.

  • Startup-pace tolerance - You've worked on a small team where ambiguity is the default and "figure it out" is the assignment. You don't need a ticket to start, and you push back when something doesn't make sense.

  • Familiarity with high-scale environments - Either a high-traffic online product where you owned scaling problems, or an infrastructure-product company where infra was the product. Both work.

  • Excited to work in person from San Francisco on a DevTool product - We work as a team in person and the work moves faster when we're in the same room.

Bonus points for

  • Deep cloud expertise in AWS, GCP, or Azure

  • Experience building self-hosted or BYOC deployments for enterprise customers

  • Contributions to open-source infrastructure projects (Terraform providers, Kubernetes operators, Cloudflare modules)

  • Experience helping infra-side AI tooling (Claude, Codex) carry real production migrations

What it’s like to work at E2B

We’re a fast-growing startup with offices in San Francisco and Prague, Czech Republic, and we work in person (4 days on-site, 1 day WFH). We already generate 8-figure revenue and work directly with top-tier AI companies like Perplexity, Hugging Face, and other exciting teams pushing the frontier of AI.

We cover full healthcare, vision, and dental insurance, and offer unlimited PTO.

Skills Required

  • 5+ years operating production cloud infrastructure
  • Hands-on Kubernetes at meaningful scale
  • Multi-cloud comfort, with one cloud at expert depth
  • Terraform as a first-class skill
  • Comfortable reading and writing code
  • Familiarity with high-scale environments
  • Excited to work in person from San Francisco

The Company
HQ: San Francisco, CA
17 Employees
Year Founded: 2023

