Staff Machine Learning Infrastructure Engineer

Redwood City, CA, USA
In-Office
220K-320K Annually
Senior level
Robotics
We are at the forefront of revolutionizing robotic manipulation.
The Role
The role involves designing and maintaining large-scale ML infrastructure, optimizing distributed training systems, and enhancing computing performance for model training.
Join us to shape the next frontier of AI-driven robotics!

Dyna Robotics makes general-purpose robots powered by a proprietary embodied AI foundation model that generalizes and self-improves across varied environments with commercial-grade performance. Dyna's robots have been deployed with customers across multiple industries. Its frontier model has the top generalization and performance in the industry.

Dyna Robotics was founded by repeat founders Lindon Gao and York Yang, who sold Caper AI for $350 million, and former DeepMind research scientist Jason Ma. The company has raised over $140M, backed by top investors, including CRV and First Round. We're positioned to redefine the landscape of robotic automation.

Position Overview

As a Staff ML Infrastructure Engineer, you are the architect of our "Training Engine." You will bridge the gap between raw hardware and cutting-edge research, ensuring that our ML team can iterate at lightning speed without friction. Your goal is simple: maximize "intelligence-per-watt" by optimizing every millisecond of the training and inference pipeline.

What You’ll Do
  • Scale Distributed Training: Architect and own the infrastructure for large-scale GPU clusters. You’ll implement sharding, activation checkpointing, and memory optimization (ZeRO, FSDP) to enable the training of massive multimodal models.

  • Optimize Researcher Ergonomics: Build a research codebase and job scheduling system (Kubernetes/SLURM) that prioritizes fast iteration, automated retries, and seamless failure recovery.

  • High-Performance Data Handling: Design high-throughput pipelines to ingest and transform terabytes of multimodal robot data (video, proprioception, 3D signals), ensuring dataloaders never starve the GPUs.

  • Production Inference: Build low-latency inference pipelines for real-time robot control. You’ll apply quantization, distillation, and model compilation (TensorRT, Triton) to move models from the lab to the physical world.

  • Deep Systems Profiling: Dive into the weeds of GPU utilization, I/O bottlenecks, and memory fragmentation to squeeze every bit of performance out of our expanding compute fleet.

What You’ll Bring
  • 7+ Years of Engineering: A track record of leading technical projects in high-performance computing (HPC) or ML infrastructure.

  • ML Systems Mastery: Deep experience with PyTorch and distributed training frameworks (DeepSpeed, Accelerate). You understand the nuances of mixed precision and gradient accumulation.

  • Infrastructure Expertise: Hands-on experience managing cloud GPU environments (GCP/AWS) and container orchestration (Kubernetes).

  • Low-Level Intuition: A fundamental understanding of distributed systems, including race conditions, memory management, and NCCL/inter-node communication.

  • Ownership Mindset: You don't just "deploy" code; you design, build, and operate systems end-to-end to unblock fast-moving research.

Bonus Points For
  • Experience with robotics data formats (MCAP, Protobuf) or multimodal models such as vision-language-action models (VLAs).

  • Deep ML systems experience: custom kernels (Triton), compilers, or runtime optimization.

  • Experience as a founding or early-stage infrastructure hire.

At Dyna Robotics, we build technology for the real world, which requires a team as diverse as the environments our robots inhabit. We are an equal opportunity employer committed to technical rigor and mutual respect.

Don’t let a checklist stop you. Data shows that people from underrepresented groups often apply only if they meet 100% of the criteria. We value problem-solving and grit over keyword matching. If you’re passionate about the intersection of AI and robotics, we want to hear from you, even if you don't check every box.

Top Skills

Accelerate
AWS
Distributed Systems
GCP
High-Performance Computing
Kubernetes
PyTorch
TensorRT
Triton

The Company
0 Employees

What We Do

Our mission is to empower businesses by automating repetitive, stationary tasks with affordable, intelligent robotic arms. Leveraging the latest advancements in foundation models, we're driving the future of general-purpose robotics, one manipulation skill at a time.
