In this role, you will:
- Build the Zoox Training framework used by all ML teams within Zoox. This framework must be highly scalable, reliable, and efficient.
- Build the off-vehicle inference service that powers our foundational models and the models that improve our rider experience.
- Lead the design, implementation, and operation of a robust and efficient ML platform to enable the training, validation, serving, and monitoring of ML models.
- Collaborate closely with cross-functional teams, including ML researchers, software engineers, and data engineers, to define requirements and align on architectural decisions.
- Help junior engineers on the team grow their careers by providing technical guidance and mentorship.
Qualifications
- 4+ years of ML infrastructure experience.
- Experience building large-scale, distributed, multi-node GPU model training and/or high-throughput, low-latency serving systems.
- Experience with training frameworks like PyTorch and OSS frameworks like Ray.
- Experience with GPU-accelerated inference using TensorRT, Ray Serve, or a similar framework.
- Experience working with cloud providers like AWS.
What We Do
Zoox is an autonomous mobility company that was founded to provide a safer, cleaner, and more enjoyable future on the road. To achieve that goal, the company has spent the past 10 years creating a purpose-built robotaxi that gives the world a better way to ride.
Why Work With Us
At Zoox, we are working to solve one of the greatest technological challenges of our generation.
From the beginning, we have been focused on our goal of reimagining transportation from the ground up. We are a mission-driven community of innovators working together to create a safer, cleaner, and more enjoyable future on the road.