The Role
Design, build, and operate cloud and edge data pipelines for robotic systems. Manage large multimodal datasets, implement on-device capture (MCAP/Protobuf), automate MLOps/DataOps workflows, ensure data quality, and build tools for debugging, experimentation, and production deployments across AWS and edge devices.
Company Overview
We’re building intelligent robotic arms that can learn new skills in hours, not months. Backed by Y Combinator and top-tier Silicon Valley investors, we’re turning physical AI into reality, helping industries facing critical labor shortages (manufacturing, logistics, and more) automate back-of-house tasks like packaging, kitting, and assembly.
Our flagship robot combines affordable robotic hardware with cutting-edge imitation learning algorithms, enabling reliable, sample-efficient robots that deliver customer value from day one. We’re already live with pilot partners and scaling fast. The founding team brings experience from Apple, Stanford, and Microsoft, with deep expertise in robotics, embodied AI, and large-scale machine learning.
The Role
We’re looking for a Robotics Data Infrastructure Engineer to own and build the data systems that power our robots in the real world. This is a hands-on founding engineer role with true ownership and freedom: your work will directly impact robots performing customer-critical tasks every day.
You will architect and deploy data pipelines on both AWS and edge devices, manage large-scale multimodal datasets (images, video, time-series, text, etc.), and build the tooling that connects real-world robot data to training and evaluation workflows. You’ll work across the full robotics software stack, from ingesting sensor data and telemetry to enabling the large-scale policy learning pipelines that drive production robots.
Beyond writing great code, you’ll help drive technical decisions, lead cross-functional efforts, and translate robotics, machine learning, and product requirements into scalable, reliable systems.
What You’ll Do
- Build and own our data backbone on AWS: Design and run cloud and edge pipelines using services like IoT Core, S3, ECR, Batch, ECS/EKS, and Step Functions. Your work keeps robot data flowing reliably and cost-efficiently from the field into the lab.
- Develop on-device data systems: Build robust, fault-tolerant data capture on edge PCs using MCAP/Protobuf, with clean schema contracts, buffering, and resumable uploads to the cloud.
- Wrangle massive multimodal datasets: Organize and version millions of images, videos, time-series (robot state, force/torque), and annotations. Enforce metadata, retention, and access patterns that scale.
- Build MLOps and DataOps pipelines: Automate data validation, labeling, augmentation, and model training/evaluation using containerized jobs and orchestrators like Batch, Step Functions, Airflow, or Prefect.
- Ensure data quality and health: Create ingestion checks, schema validation, deduplication, drift detection, and real-time alerting around data freshness and completeness.
- Build internal tools that unblock others: Develop UIs/CLIs for browsing data, launching jobs, tracking experiments, and debugging robots in the field. Integrate with tools like Foxglove.
- Work across teams: Partner with hardware, ML, and product to turn raw field data into smarter robots and real customer value, fast.
Qualifications
- B.S., M.S., or Ph.D. in computer science or a related field.
- Strong programming skills in Python (you write clean, efficient, production-ready code).
- Strong experience with AWS.
- Systems engineering skills (networking, concurrency, performance).
- At least 2 years of full-time work experience for candidates with a B.S. in a related field, or 1 year for M.S. or Ph.D. candidates.
The base pay range for this role is $110,000 – $175,000 per year.
Skills Required
- B.S., M.S., or Ph.D. in computer science or related field.
- Strong programming skills in Python (production-ready code).
- Strong experience in AWS (IoT Core, S3, ECR, Batch, ECS/EKS, Step Functions).
- Systems engineering skills (networking, concurrency, performance).
- At least 2 years full-time experience for B.S. candidates or 1 year for M.S./Ph.D. candidates.
- Experience building on-device data capture and upload systems using MCAP and Protobuf.
- Experience designing and operating MLOps/DataOps pipelines (Airflow, Prefect, containerized jobs, orchestration).
- Experience organizing, versioning, and enforcing metadata/retention for large multimodal datasets (images, video, time-series, annotations).
- Ability to build internal UIs/CLIs and integrate debugging/telemetry tools (e.g., Foxglove).
The Company
What We Do
BLD Talent is a specialized recruiting firm that connects technical and product talent with startups ranging from seed to Series B stages. Operating as a high-signal, solo-operator agency, they focus on placing software, product, and hardware professionals into growth-stage and established teams. Their mission is to streamline the hiring process by providing direct, expert-led recruiting services without the complexity of traditional agency handoffs.




