ML Ops Engineer

Posted 3 Days Ago
San Francisco, CA, USA
In-Office
175K-220K Annually
Mid level
Information Technology • Security • Automation
Who We Are
Sauron protects your family and home, bringing the innovations of autonomous robots and self-driving cars to residential security. Our team is led by veteran operators and engineers, alumni of Sonos, PayPal, Tesla, Apple, and Google. Sauron has raised a $27M seed round led by A* and Atomic with participation from other leading venture capital firms.

The Role
We're looking for an MLOps Engineer who thrives at the intersection of perception systems, infrastructure, and real-world deployment. You'll play a key role in making sure our cutting-edge AI systems can be seamlessly deployed to homes across the country - reliably, securely, and at scale.
Your work will span everything from robust ML deployment infrastructure on the edge to networking and observability on real devices in the field. If you've ever wanted to put advanced robotics and AI into the hands of everyday people, this is the place to do it.

What You’ll Do
  • Own and evolve the deployment lifecycle for our perception systems across edge and cloud environments.
  • Design and manage highly available ML serving infrastructure, ensuring high performance, low-latency inference, and reliability in production.
  • Build resilient CI/CD pipelines for testing and pushing system updates with confidence, backed by comprehensive fleet observability.
  • Implement and manage remote system monitoring, alerting (e.g., Prometheus, Grafana, Sentry), and debugging systems to ensure operational excellence, focusing on fleet health metrics (e.g., uptime, resource utilization, inference latency).
  • Work closely with perception and backend teams to design deployable systems that are robust in the real world.
  • Integrate and maintain experiment tracking and model management platforms (e.g., Weights & Biases, MLflow) to streamline model lineage, performance comparison, and versioning from research to production.
  • Contribute to security policy design and device authentication/attestation infrastructure for fleet safety.
  • Build and maintain internal tooling and CLI utilities to streamline the end-to-end development-to-deployment workflow, empowering the broader engineering team to ship perception systems with high velocity and minimal friction.

What You Bring
  • 3-5+ years of experience in DevOps, deployment engineering, or site reliability, ideally with production ML systems or robotics.
  • Deep operational experience with Linux system administration, system packaging (e.g., Deb/RPM), and configuration management tools (e.g., Ansible, SaltStack, Chef).
  • Strong experience with ML deployment/serving frameworks and infrastructure (e.g., PyTorch Serve, custom C++ inference services).
  • Comfortable working in Linux-heavy environments with advanced shell scripting and strong knowledge of operating system internals.
  • Hands-on experience with networking fundamentals, including TCP/IP, firewalls, NAT traversal, and VPNs.
  • Prior experience with managing large-scale edge fleets, including over-the-air (OTA) updates and blue-green deployment strategies.
  • A proven track record of developing internal developer tools or CLI applications that automate complex infrastructure tasks and improve overall team productivity.

Nice to Have
  • Experience deploying AI/ML inference pipelines on bare-metal or virtualized edge hardware (e.g., using GStreamer/DeepStream pipelines, custom executables).
  • Expertise in machine learning inference engineering, including quantization and compilation (e.g., using ONNX Runtime, TensorRT), for efficient deployment to various edge hardware targets (e.g., NVIDIA Jetson, custom ARM SoCs).
  • Familiarity with writing or debugging high-performance, low-latency ML inference services in C++.
  • Exposure to remote logging, log ingestion, and distributed telemetry aggregation.
  • Previous experience in early-stage startups or fast-paced hardware/software integration environments.

Why Sauron
You’ll be joining a deeply technical team obsessed with building real-world systems that make a tangible difference in people’s lives. We move quickly, iterate relentlessly, and ship with urgency - all while holding a deep respect for software craftsmanship and system reliability. If you're looking to solve challenging problems and own major parts of the deployment stack for a category-defining product, we want to talk.

We Value
1. The Power of "We": "Align, then accelerate."
  • We celebrate as a team and troubleshoot as a team.
  • The goal is the mission, not the credit.
2. High Challenge, Low Ego: "Respect the person, debate the idea."
  • Be ruthless with problems, but kind to people.
  • Raise the bar, lower the shield.
3. Speak Up: "Silence is a setback."
  • Your perspective is a requirement, not a suggestion.
  • Speak the hard truths early so we can fix them fast.
4. Integrity in Motion: "Own the outcome, not just the task."
  • Do what you say you’ll do.
  • If it breaks, fix it. If it works, make it better.
5. Humanity at the Core: "Relationships over transactions."
  • Earn trust through empathy and consistency.
  • Anticipate needs before they become requests.

The compensation range for this position is $175-225k base + equity + benefits.

We are focused on building a diverse and inclusive workforce. If you’re excited about this role, but do not meet 100% of the qualifications listed above, we encourage you to apply.
-----
Sauron is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis forbidden under federal, state, or local law.

Compensation
The base pay range for this role is $175,000 – $220,000 per year.

Top Skills

Ansible
Chef
DeepStream
GStreamer
Linux
ONNX Runtime
PyTorch Serve
SaltStack
TensorRT

The Company
HQ: San Francisco, CA
41 Employees
Year Founded: 2024

What We Do

Meet your home operating system. Home security should reinforce your peace of mind. You deserve a sanctuary, one that gives you the freedom to live your life with the people that matter most. That’s why we created something entirely new. Sauron is an autonomous platform for the perimeter of your home. It works discreetly in the background to reliably identify potential threats in all environmental conditions and it instantly recognizes who’s part of your inner circle. We provide a bespoke white-glove service, ensuring that each client’s security platform is installed with precision and care, swiftly and without disrupting the comfort or aesthetics of their home. To complement the technology, Sauron leverages its Intelligent Response and Intrusion Suppression (IRIS) Command Center, staffed 24/7 by exceptionally trained agents with diverse backgrounds in law enforcement, military service, executive protection, and other critical security fields. The team builds relationships with local police departments to ensure a rapid police response to verified security incidents.
