Distributed Systems Engineer

Sorry, this job was removed at 10:06 a.m. (CST) on Tuesday, Oct 07, 2025
Be an Early Applicant
Palo Alto, CA
In-Office
150K-300K Annually
Artificial Intelligence • Software
The Role

We are looking for people with strong ML & Distributed systems backgrounds. This role will work within our Research team, closely collaborating with researchers to build the platforms for training our next generation of foundation models.

Responsibilities

  • Work with researchers to scale up the systems required for our next generation of models trained on multi-thousand GPU clusters.

  • Profile and optimize our model training code-base to achieve best in class hardware efficiency.

  • Build systems to distribute work across massive GPU clusters efficiently.

  • Design and implement methods to robustly train models in the presence of hardware failures.

  • Build tooling to help us better understand problems in our largest training jobs.

Experience

  • 5+ years of work experience.

  • Experience working with multi-modal ML pipelines, high performance computing and/or low level systems.

  • Passion for diving deep into systems implementations and understanding their fundamentals in order to improve their performance and maintainability.

  • Experience building stable and highly efficient distributed systems.

  • Strong generalist Python and Software skills including significant experience with Pytorch.

  • Good to have experience working with high performance C++ or CUDA.

Your application is reviewed by real people.

Similar Jobs

Remote or Hybrid
9 Locations
213000 Employees
27-41 Hourly

Cox Enterprises Logo Cox Enterprises

Solutions Architect

Automotive • Cloud • Greentech • Information Technology • Other • Software • Cybersecurity
Remote or Hybrid
CA, USA
50000 Employees
139K-208K Annually

Square Logo Square

Senior Data Scientist

eCommerce • Fintech • Hardware • Payments • Software • Financial Services
Remote or Hybrid
8 Locations
12000 Employees
168K-297K Annually

Square Logo Square

Strategic Account Manager

eCommerce • Fintech • Hardware • Payments • Software • Financial Services
Remote or Hybrid
8 Locations
12000 Employees
108K-203K Annually
Get Personalized Job Insights.
Our AI-powered fit analysis compares your resume with a job listing so you know if your skills & experience align.

The Company
HQ: San Francisco Bay Area, CA
178 Employees
Year Founded: 2021

What We Do

Empowering the world to visualize ideas through pioneering research and design. Try Dream Machine for free → lumalabs.ai/dream-machine

Similar Companies Hiring

Scotch Thumbnail
Software • Retail • Payments • Fintech • eCommerce • Artificial Intelligence • Analytics
US
25 Employees
Milestone Systems Thumbnail
Software • Security • Other • Big Data Analytics • Artificial Intelligence • Analytics
Lake Oswego, OR
1500 Employees
Idler Thumbnail
Artificial Intelligence
San Francisco, California
6 Employees

Sign up now Access later

Create Free Account

Please log in or sign up to report this job.

Create Free Account