Mentee Robotics is redefining humanoid automation with an AI-first approach, combining cutting-edge perception, reasoning, and dexterous manipulation in a fully autonomous humanoid robot that continuously adapts and learns. Our flagship product, Menteebot v3, is designed to integrate seamlessly into industrial, logistics, and retail environments, performing complex tasks with human-like adaptability.
We are looking for an experienced Data Engineer to build the foundational data infrastructure that turns raw simulation and real-robot recordings into curated, traceable, training-ready datasets — the fuel that powers Menteebot's intelligence.
Responsibilities
- As a Data Engineer at Mentee Robotics, you will own the data lifecycle for Physical AI and humanoid robotics: the pipelines and systems that move data from the moment it is captured to the moment it trains a model.
- You will design and scale the pipelines that convert large volumes of simulation and real-robot recordings into state-of-the-art datasets — handling extraction, transformation, feature schemas, statistics, and video at scale. You will build the curation systems that generate and select high-quality training data, and the release tooling that produces versioned datasets.
- You will work with various reinforcement learning and computer vision algorithms that generate realistic simulation data for humanoid robotics tasks.
- You will build robust, production-grade systems that bridge data producers and consumers, accelerate dataset iteration, and make every dataset that reaches a model trustworthy, reproducible, and traceable to its source.
Requirements
- B.Sc. in Computer Science, Engineering, or a related field.
- 4+ years of hands-on experience in data engineering, backend software engineering, or data infrastructure.
- Python Expertise: Extensive experience and strong proficiency in Python (a must-have).
- Data at Scale: Proven experience designing and operating data pipelines and storage/serialization formats (e.g., Parquet, HDF5, Arrow) for large datasets.
- System Architecture: Proven ability to build robust, well-tested data tooling and integrate complex software components into reliable pipelines.
What We Do
A personalized AI-based robot you can mentor.









