Software Engineer, Robotics & VLA Systems
Mountain View, CA · Full-Time · Physical AI Team
About Deccan AI
Deccan AI is a model training and evaluation startup (Mountain View, CA; delivery center in Hyderabad). Founded by alumni of IIT Bombay, IIM Ahmedabad, and Google, and backed by Prosus Ventures, we build expert-curated datasets and evaluation infrastructure for frontier AI labs including Google DeepMind and Snowflake. Our Physical AI practice is building the data backbone for embodied AI: annotation, synthetic data generation, and model evaluation that help robots learn manipulation and reasoning at scale.
The Role
VLA (Vision-Language-Action) foundation models need more than research breakthroughs — they need production software that connects data pipelines, simulation, training loops, and evaluation into a reliable system. You'll build that system. This is not a pure ML research role or a data engineering role — it sits at the intersection, where you write the code that makes VLA training work end-to-end. You'll work directly with frontier lab clients and ship systems used to train the next generation of robot foundation models.
What You'll Do
Build the VLA training integration layer — data loaders, format converters, and preprocessing that feed curated datasets (real + synthetic) into training frameworks (LeRobot, Octo, OpenVLA, π0).
Develop evaluation and benchmarking infrastructure: sim-based rollouts in Isaac Lab, success-rate tracking, regression detection, and automated reporting for client delivery.
Own dataset management — versioning, schema validation, metadata indexing, and format conversion across Open X-Embodiment, LeRobot HDF5, RLDS, and client-specific formats.
Implement sim-to-real transfer tooling: domain randomization configs, Cosmos Transfer integration, and quality validation ensuring synthetic data improves policy performance.
Build annotation platform backend systems — task taxonomy APIs, temporal segmentation, quality scoring, and inter-annotator agreement for robotics episode data.
Integrate with the NVIDIA Isaac ecosystem (Isaac Sim, Isaac Lab, GR00T-Mimic, OSMO) to orchestrate synthetic data generation on cloud GPU infrastructure.
Required
MS or PhD in CS, Robotics, or ML (or equivalent industry experience). 2+ years shipping production systems — not just prototypes.
Strong Python and systems skills. Comfortable with Linux, Docker, CUDA, and cloud infrastructure (AWS/GCP).
Working knowledge of robot learning: imitation learning, behavior cloning, diffusion policies, or RL. You need to understand how VLA training works in practice, not just in theory.
Experience with at least one robotics sim (Isaac Sim, MuJoCo, PyBullet, Gazebo) and robotics data formats/middleware (ROS/ROS2, URDF/USD, MCAP, HDF5).
Clean, tested, documented code. CI/CD and code review experience expected.
Preferred
Hands-on with VLA codebases: RT-2, Octo, OpenVLA, π0, GR00T N1, or LeRobot.
Experience with NVIDIA Omniverse, Replicator, or Cosmos for synthetic data; sim-to-real transfer techniques (domain randomization, NeRF, Gaussian Splatting).
Built ML eval frameworks, model benchmarking suites, or RLHF/DPO training pipelines.
Open-source contributions in robotics or ML. Published research in manipulation, embodied AI, or VLA systems is a plus.
Why This Role
Founding-stage impact. You're joining the robotics practice at inception. The systems you build will define how Deccan delivers physical AI data for years to come.
Frontier lab clients. Your work ships directly to teams like Google DeepMind. Few roles put you this close to the cutting edge of embodied AI.
Full-stack ownership. Annotation backends, VLA training integration, sim-based eval — you own entire systems, not isolated tickets.
Skills Required
- MS or PhD in Computer Science, Robotics, or Machine Learning (or equivalent industry experience)
- 2+ years shipping production systems
- Strong Python and systems skills
- Experience with Linux, Docker, and cloud infrastructure (AWS/GCP)
- Working knowledge of robot learning techniques
- Experience with robotics simulations and data formats
- Experience in writing clean, tested, documented code
What We Do
Deccan AI is a Bay Area-based company specializing in providing high-quality, human-curated datasets and AI-enabled operations to enhance the performance of large language models (LLMs) and foundation models for leading AI labs and enterprises.