What we’re doing isn’t easy, but nothing worth doing ever is.
We envision a future powered by robots that work seamlessly with human teams. We build artificial intelligence that enables service robots to collaborate with people and adapt to dynamic human environments. Join our mission-driven, venture-backed team as we build out current and future generations of humanoid robots.
The MLOps Engineer works with engineering teams, IT, and Security to address unique business challenges through comprehensive solutions while taking into account system uptime, reliability, and maintainability. You will instrument and monitor the breadth of our full platform stack (hosts, applications, and performance). In this role you will work closely with our engineering and information security teams to enhance the automated system provisioning and deployment subsystems within codified infrastructure. You will work with developers to create more robust and scalable services independent of cloud implementations. You will help to isolate, trap, and respond to inevitable system failures, and develop strategies for continuous monitoring and analysis to reduce both downtime and required manual intervention. You will participate in the on-call rotation to maintain platform SLAs.
Key Responsibilities
- Analyze our current operational toolset for shortcomings and product improvements; provide and implement recommendations.
- Create, configure, and maintain cloud-based infrastructure and services for the rapid development and monitoring of complex robotics and analytics applications.
- Build tools to automate monitoring and management of robot fleets.
- Build tools to automate and improve MLOps tooling and workflows.
- Build tools to automate and improve data workflows for ML training and simulation.
- Triage issues as they arise, both on robots and in deployed software.
- Automate common operations to allow Diligent’s robotic fleet to scale exponentially.
- Be an active member of the software engineering team, helping to improve the organization’s SDLC process and minimize the time from code-complete to production.
- Mentor engineers in SRE best practices and modern software engineering.
- Occasional off-hours, on-call work required.
Qualifications
- Bachelor’s degree in Computer Science, related field, or equivalent experience
- 5+ years of combined experience in MLOps, DevOps, Software Engineering, or related technical roles.
- Deep experience in modern cloud infrastructure (AWS, Azure, GCP), especially managed ML/AI services.
- Experience with modern datastores at small to medium scale (Firestore, Redshift, Postgres, Mongo) and with distributed queues and message brokers (Kafka, Mosquitto).
- Experience automating system provisioning, configuration, and Infrastructure as Code (Terraform, Ansible, etc.).
- Experience managing hosting environments, including database administration and scaling applications to handle load changes.
- Experience soliciting systems requirements, designing, and implementing new platform components leveraging infrastructure or SaaS services.
- Experience working with distributed, fault tolerant systems
- Experience with converting monolithic applications to microservices and service discovery technology
- Solid Linux skills and proficiency in at least one high-level language (e.g., Python).
- Experience working in an agile methodology development lifecycle
What We Do
We are a human-centered robotics company. Our mission is to make technical advances towards robots and humans working together side by side, with an emphasis on human-centric design. Diligent Robotics is developing a suite of artificial intelligence that enables robots to collaborate with and adapt to humans in everyday environments.
Our service robots are designed to participate and work together with teams of humans. Our first product is a hospital service robot that can assist clinical staff with logistical tasks, allowing them to spend more of their time on direct patient care, improving patient satisfaction, quality of service, and safety.








