ML Ops Engineer (Boston, MA)

Posted 21 Days Ago
Hiring Remotely in Boston, MA, USA
In-Office or Remote
Senior level
Artificial Intelligence • Design • Generative AI • Manufacturing
Engineering General Intelligence
The Role
Architect and manage ML pipelines for training and deployment on cloud platforms, automate CI/CD, and ensure system performance and availability.
Responsibilities:
 
  • Architect, build, and operate end-to-end ML pipelines for training, validation and deployment on Google Cloud and AWS.
  • Define, instrument, and maintain logging, monitoring, and alerting for model performance and data drift.
  • Automate CI/CD for ML artifacts and infrastructure using GitHub Actions or equivalent.
  • Collaborate with cross-functional teams, including frontend engineers, backend engineers, research engineers, and infrastructure engineers.
  • Write clean, well-documented, fast, and maintainable code.
  • Help ensure our systems have high availability and performance.
About Us: 
 
We are an MIT-born, venture-backed Silicon Valley startup building a real-life 'Jarvis'—an AI Copilot for design and manufacturing. Our goal is to utilize advanced AI, physics simulation, and computer graphics to reduce costs and improve engineering productivity across all steps of the design and manufacturing process.

What we're looking for

  • BS in Computer Science or a related field.
  • 5+ years of experience as an AI/ML Ops, DevOps, or Infrastructure Engineer, or equivalent.
  • Expert-level Python and TypeScript skills.
  • Experience with Docker, Kubernetes, Terraform, Google Cloud and AWS.
  • Deep understanding of machine learning models, including LLMs.
  • Experience designing and maintaining CI/CD pipelines to fine-tune or train ML models.
  • Excellent written and verbal communication skills.

Bonus Points

  • Experience in computer graphics or physics-based simulation.
  • Background in setting up Prometheus/Grafana, ELK, or similar monitoring stacks.
  • Experience with Vertex AI.
  • Experience working with custom Domain-Specific Languages.

Our tech stack

  • Google Cloud, AWS
  • Python, TypeScript
  • Protobuf, gRPC
  • Next.JS, React.JS
  • GitHub Actions
  • Docker, Kubernetes, Spinnaker
  • PostgreSQL

The Company
HQ: Los Altos, California
27 Employees
Year Founded: 2023

What We Do

Meet the AI platform to revolutionize engineering. We started this company because manufacturing and engineering are overdue for a digital revolution. These fields are complex, and while agile, lean, and just-in-time methods have driven progress, AI now offers a transformative opportunity to reshape how work gets done. Foundation’s EGI platform is built to accelerate every stage of the product development cycle, from research and design to manufacturing and documentation, so engineers can build better products faster and more efficiently.
