AI Infrastructure Engineer

Menlo Park, CA, USA
In-Office
Mid level
Artificial Intelligence • Robotics • Automation • Manufacturing

ABOUT MATTER

Matter is building the AI-native autonomy stack for physical manufacturing in the United States. We operate our own factories, deploy our own software, and collect data from every stage of production — from CAD intake to finished goods.

Our platform, MatterOS, is the unified software layer for factory operations, process orchestration, and autonomy deployment. The data pipeline that feeds it — from machine telemetry on the floor to model training in the cloud — is the infrastructure you will build and own.

 

THE ROLE

We are hiring an AI Infrastructure Engineer to design and operate the data and compute systems that power MatterOS and our Sim2Real training pipeline. You will work across edge computing, cloud training infrastructure, and the data pipelines that make our “Smart Data” strategy real.

Your job is to ensure that every data point — from a torque sensor reading to a camera frame — is tagged with the machine ID, process state, and production context that makes it trainable.

 

WHAT YOU’LL DO

•      Design and maintain the edge-to-cloud data pipeline with semantic context preserved end-to-end

•      Build and manage GPU compute infrastructure for VLA model training, experiment tracking, and distributed training workflows

•      Implement the data collection layer for 100% capture from modular assembly workcells, including camera feeds, sensor streams, machine state, and process metadata

•      Develop feature engineering pipelines that transform raw operational data into structured training inputs for AI models

•      Manage model deployment to edge hardware in the factory: latency, versioning, rollback, and monitoring in production

•      Build observability systems that surface model performance degradation, data drift, and equipment anomalies in real time

•      Collaborate with AI researchers to translate model requirements into infrastructure specifications and vice versa

 

WHAT WE’RE LOOKING FOR

•      3+ years of experience in ML infrastructure, MLOps, or data engineering in a production environment

•      Strong command of distributed data systems: Kafka, Flink, or equivalent; time-series databases (InfluxDB, TimescaleDB, or similar)

•      Experience with GPU cluster management and distributed training (SLURM, Ray, or Kubernetes-based)

•      Familiarity with industrial protocols: OPC UA, MQTT, Modbus (or willingness to learn quickly)

•      Proficiency in Python; comfort with C++ or Rust for performance-critical edge components is a plus

•      Systems thinking: you understand that data quality, not data volume, is what makes AI work in constrained physical environments

 

NICE TO HAVE

•      Experience with NVIDIA Isaac Sim, ROS2, or edge AI deployment (Jetson, FPGA, or similar)

•      Background in industrial IoT or factory automation systems

•      Familiarity with model serving frameworks (Triton, TorchServe, or ONNX Runtime)

 

WHY MATTER

Most AI infrastructure roles are about keeping existing systems running. At Matter, you are building the infrastructure from scratch for a category that doesn’t fully exist yet: autonomous physical manufacturing.

The Company
101 Employees
Year Founded: 2025

What We Do

Matter builds and operates autonomous factories for complex, mission-critical hardware, leveraging AI-native systems and modular automation to re-industrialize America.
