AI Infra Engineer

Posted Yesterday
San Francisco, CA, USA
In-Office
200K-275K Annually
Senior level
Artificial Intelligence • Healthtech • Machine Learning • Software
The Role
Build and operate scalable AI infrastructure for hospital deployments: set up networking and VPNs, manage AWS multi-account infrastructure (ECS/EKS, Terraform), build data pipelines (S3, Iceberg, Spark, Dagster, Airflow), productionize ML models, and implement observability to ensure reliable daily risk-ranking for clinicians.
About HealthLeap

HealthLeap builds AI that helps clinicians prioritize patients, surfaces the right data, and gets patients the care they need earlier, so they can leave the hospital sooner.

We integrate with hospital electronic health record systems, screen 100% of patients daily, and risk-rank them in real time. Clinicians at Cedars-Sinai and Penn Medicine start every morning with HealthLeap — with Houston Methodist, Emory, and Intermountain Health deploying now.

Real results: 39% more diagnoses. Detection 4 days earlier. $11M/year ROI for our first site at Cedars-Sinai. 7× revenue growth in 7 months.

We started with malnutrition. We're expanding to every major condition to ensure no patient falls through the cracks. Sequoia and First Round are backing us to build the platform that screens every patient for everything and drives tangible outcomes.

We're ~15 people. >$7M raised. SF-based, hybrid-friendly. Early enough to shape the product. Late enough to know it works. Results that are changing lives.

About the Role

Build the infrastructure that screens every hospital patient, every day.

HealthLeap processes billions of data points from hospital EHRs. Data flows through secure connections, transforms through data pipelines, and powers ML models that clinicians rely on every morning. You'll work alongside our data scientists and ML engineers to build and operate the infrastructure that makes this possible.

Why you

You've built AI infrastructure at a startup where you owned everything: you set up the AWS accounts, wrote the Terraform, deployed the ML models, and debugged the VPN at 2am. You've taken models from Jupyter notebooks to production. You're the person who figures things out when the docs don't help.

What you'll do
  • Deploy new hospitals: site-to-site VPNs, networking, container services, data pipelines

  • Build reliable data infrastructure that ingests, transforms, and serves data at scale (S3, Iceberg, Spark, Dagster, Airflow)

  • Partner with data scientists and ML engineers to productionize models and get them running reliably at scale

  • Own AWS infrastructure: multi-account orgs, ECS/EKS, Terraform, networking

  • Set up observability: monitoring, alerting, logging that actually helps debug issues

  • Make hospital deployments faster and more reliable

What we're looking for
  • 5+ years engineering experience, with significant infrastructure work

  • Deep AWS experience: networking, IAM, ECS/EKS, S3, and related services

  • Solid with Terraform or similar infrastructure-as-code

  • Experience with data pipelines and orchestration (Dagster, Airflow, or similar)

  • You've operated production systems, not just built them

  • You figure things out, even when documentation doesn't exist

Nice to have
  • MLOps experience: deploying and monitoring ML models in production

  • Healthcare data standards (FHIR, HL7v2)

  • Experience with Kubernetes, service mesh, or complex networking

  • Background at an early-stage startup where you owned infra end-to-end

This role is NOT for you if
  • Startup unpredictability feels like chaos to you. We find it exciting.

  • You've never owned something end-to-end. Here, you own outcomes, not tasks.

  • You wait to be told what to do next. We need people who see what's needed and do it.

  • You're looking for a 9-5 with predictable hours. We care about deep work and deep rest (we have a minimum leave policy), but when a hospital go-live is on the line, we show up - even if that means a 60+ hour week.

  • You've only worked on platform teams at large companies. We need someone who's built infra from scratch.

  • You see collaboration as interruption. We see it as leverage. The best engineers here are constantly pulling each other in. If you prefer to work in isolation, this isn't the right fit.

Interview process
  1. Intro call - Get to know each other

  2. Technical - 1-2 interviews

  3. Onsite - Coding, case study, team meet (~4-5 hours)

  4. Decision - Same week as onsite

We respect your time. If there's a fit, you'll know fast.

Compensation & Benefits
  • Salary: $200,000 - $275,000 base

  • Equity: Meaningful ownership in an early-stage company

  • Healthcare: 100% of premiums covered

  • PTO: Unlimited, with a recommended minimum of 20 days

  • 401(k): 4% match

  • Equipment: Laptop + budget for your home office

Location

San Francisco - in person.

If you're passionate about applying frontier AI to real-world impact, join us in building healthcare's future.

Skills Required

  • 5+ years engineering experience with significant infrastructure work
  • Deep AWS experience (networking, IAM, S3, multi-account orgs)
  • Experience with ECS or EKS (container orchestration on AWS)
  • Solid experience with Terraform or similar infrastructure-as-code
  • Experience with data pipelines and orchestration (Dagster, Airflow, or similar)
  • Experience with large-scale data technologies (Iceberg, Spark)
  • Experience operating production systems end-to-end (deployments, debugging, on-call)
  • Ability to set up site-to-site VPNs and complex networking for hospital deployments
  • Willingness/ability to work in-person in San Francisco

The Company
20 Employees
Year Founded: 2022

What We Do

HealthLeap is an AI-driven clinical decision support platform designed to improve patient outcomes and hospital margins. It serves as an AI screening 'safety net' for hospitalized patients, primarily focusing on daily malnutrition screening to identify high-risk inpatients earlier than manual tools. By surfacing patients who need care, it enables timely clinician intervention, reduces hospital readmission rates, and improves billing accuracy for hospitals.
