The Role
As a Senior AI Engineer, you'll design data pipelines and apply self-supervised learning on multi-modal sensor data to enhance robotics models.
RIVR is a Swiss robotics company pioneering Physical AI and robotic solutions to revolutionize last-mile delivery, giving 1 human the power of 1000. Through the combination of artificial neural networks and innovative robot designs with wheels and legs, RIVR aims to enhance efficiency, sustainability, and scalability in last-mile delivery. Founded as Swiss-Mile, the company rebranded to RIVR in 2025 to better reflect its mission of driving the future of intelligent robotics. Join our innovative team, renowned for pioneering robotic design and neural network applications in robotics that improve environmental understanding and decision-making. With a robust research foundation and notable contributions from ETH Zurich, we are leaders in translating artificial intelligence and robotics into practical, real-world applications.
RIVR is committed to building a diverse and inclusive team that values every perspective. If you’re passionate about driving innovation in robotics and creating meaningful impact, we encourage you to apply and bring your unique self to our team.
Job Description
Our global fleet of autonomous robots operates in the real world, generating vast amounts of multi-modal sensor data. While our VLA team focuses on building large-scale models to consume this data, much of it remains unlabeled and unstructured. We are seeking an expert in self-supervised and representation learning to unlock the full potential of this massive data pool.
In this role, you will be responsible for designing and building the core data engine that transforms raw, real-world sensor data into high-signal, structured datasets suitable for training neural networks. You will pioneer methods to automatically curate, filter, and pseudo-label this data, creating powerful representations that serve as the foundation for all downstream tasks, including navigation, imitation learning, and decision-making.
You will work directly with the VLA and Reinforcement Learning teams to define data strategies and interfaces, ensuring the data you produce directly accelerates their model development. If you are passionate about solving the "data bottleneck" in robotics and want to build the systems that learn meaningful patterns from the physical world, we invite you to join us.
What you’ll be doing
- Design, build, and maintain scalable data pipelines to process, filter, and transform terabytes of raw, multi-modal sensor data (e.g., video, LiDAR, IMU, odometry) from our robotic fleet.
- Develop and implement state-of-the-art self-supervised and representation learning algorithms to automatically extract features, discover patterns, and generate pseudo-labels from our unlabeled data.
- Collaborate closely with the VLA Foundation Model and RL teams to define data requirements, APIs, and strategies for leveraging curated datasets and learned representations.
- Architect and implement robust evaluation strategies, benchmarks, and datasets to rigorously track the performance and quality of both the data pipeline and the downstream models that consume it.
- Own the data integration workflow, creating efficient data loaders and access patterns to make high-signal data readily available for model training and experimentation.
- Research and prototype novel techniques in data curation, active learning, and anomaly detection to continuously improve the quality and efficiency of our data engine.
What you must have
- Master’s degree or higher in a relevant field such as Computer Science, Machine Learning, or Robotics.
- At least three years of industry or research experience; time spent on a PhD counts toward this.
- Deep expertise in self-supervised learning (SSL) and representation learning (e.g., contrastive learning, masked autoencoders, world models), particularly with multi-modal sensor data.
- Proven experience in building and managing large-scale data processing pipelines for machine learning (e.g., using Spark, Kubeflow, or similar cloud-native tools).
- Strong understanding of robotic sensor data (e.g., camera, LiDAR, IMU, odometry) and their characteristics.
- Strong programming skills in Python and deep experience with PyTorch, including creating custom and efficient DataLoaders.
- Experience with MLOps best practices and data versioning tools (e.g., DVC, Pachyderm).
Get some bonus points
- PhD degree in Robotics, Engineering, Computer Science, Machine Learning or a similar discipline, or an equivalent amount of research experience.
- Publications at top-tier ML or robotics conferences (e.g., NeurIPS, ICML, CVPR, CoRL, ICLR).
- Experience with generative models (e.g., GANs, Diffusion Models) for data augmentation or simulation.
We believe the best work is done when collaborating and therefore require in-person presence in our office locations.
Top Skills
DVC
Kubeflow
Pachyderm
Python
PyTorch
Spark
The Company
What We Do
Reality in Virtual Reality Limited develops virtual reality assets, both 360-degree video and photorealistic virtual reality experiences.
We offer immersive training for all industries. We scan any real-world environment and use our RiVR VR Simulation Engine and our VRM (Virtual Reality Monitor) to enable cutting-edge training anywhere in the world.
With our simulation engine we can capture any location and recreate it in photorealistic virtual reality. RiVR allows users to interact with and experience these worlds, enhancing the way humans learn.









