Staff Backend Software Engineer: Inference

Posted 9 Days Ago
Hiring Remotely in San Mateo, CA, USA
In-Office or Remote
500K-700K Annually
Senior level
Artificial Intelligence • Machine Learning • Software • Industrial
The Role
Design and develop scalable, resilient inference services, optimizing performance for AI models. Collaborate with teams to enhance AI capabilities in production environments.
About Archetype AI

Archetype AI is developing the world's first AI platform to bring AI into the real world. Formed by an exceptionally high-caliber team from Google, Archetype AI is building a foundation model for the physical world, a real-time multimodal LLM for real life, transforming real-world data into valuable insights and knowledge that people will be able to interact with naturally. It will help people in their real lives, not just online, because it understands the real-time physical environment and everything that happens in it.

Supported by deep tech venture funds in Silicon Valley, Archetype AI is currently at the Series A stage and is progressing rapidly to develop technology for their next stage. This presents a unique and once-in-a-lifetime opportunity to be part of an exciting AI team at the beginning of their journey, located in the heart of Silicon Valley.

Our team is headquartered in San Mateo, California, with team members throughout the US and Europe.

We are actively growing, so if you are an exceptional candidate excited to work on the cutting edge of physical AI and don’t see a role below that exactly fits you, you can contact us directly with your resume via jobs@archetypeai.io.

About the Job

We’re looking for a highly motivated backend engineer with extensive experience in designing and developing performant, scalable, and resilient inference services.

You’ll work closely with researchers, ML engineers, and product teams to bring cutting-edge AI capabilities into production—at scale, with reliability, and under real-world constraints.

This is an opportunity to own key services across our inference platform, from intelligent request routing to fleet-wide orchestration across diverse AI accelerators, and to contribute to some of the most advanced real-time AI serving systems in production today.

Core Responsibilities
  • Architect, implement, and maintain distributed inference serving systems that support high-throughput, low-latency model serving across multiple AI accelerator families and cloud platforms.

  • Enable breakthrough research by providing scientists with high-performance inference infrastructure to develop next-generation models.

  • Continuously optimize inference performance—including batching, caching, and request routing strategies—to maximize compute efficiency under explosive customer growth.

  • Build tooling and observability to monitor system health, identify bottlenecks, and proactively resolve instability.

  • Introduce new techniques, architectures, and best practices to push the limits of scalability, efficiency, and reliability.

  • Own problems end-to-end—from design to deployment—with a strong bias toward quality, automation, and continuous improvement.

  • Balance rapid iteration on early-stage systems with long-term maintainability and architectural soundness.

  • Contribute to a culture of engineering excellence, mentorship, and team-first collaboration.

Minimum Qualifications
  • 7+ years of professional software engineering experience, with a focus on inference.

  • Deep understanding of machine learning systems at scale, including load balancing, request routing, or traffic management.

  • Experience with inference optimization, batching, and caching strategies.

  • Ability to design APIs and service interfaces for real-time and latency-sensitive use cases.

  • Experience building and operating production-grade systems at scale in cloud environments (e.g., Azure, AWS, GCP).

  • Strong debugging, instrumentation, and observability skills across distributed systems.

  • Demonstrated ownership of complex technical problems and ability to learn and adapt quickly.

Preferred Qualifications
  • Proven track record of scaling systems through rapid growth and rebuilding or refactoring for new demands.

  • Experience building systems that degrade gracefully under load: backpressure, rate limiting, circuit breaking, bulkheading, and queuing.

  • Strong understanding of failure modes in distributed systems and mitigation techniques.

  • Proven experience owning high-availability services (e.g., SLOs, incident response, on-call), including capacity planning and load testing.

  • Proficiency in multiple programming languages (e.g., Rust, C++, Python).

  • Experience designing internal tools or platforms to support developer productivity and experimentation.

  • Strong product intuition and the ability to collaborate closely with cross-functional teams, including research and design.

What We Value
  • Ownership – You take initiative, follow through, and care deeply about quality and outcomes.

  • Motivation – You’re driven to solve complex problems and continuously raise the bar for yourself and your team.

  • Excellence – You bring discipline, clarity, and rigor to your craft—and help others do the same.

  • Collaboration – You work well with others, mentor generously, and contribute to a high-trust, high-performance culture.

The Company
41 Employees
Year Founded: 2023

What We Do

Archetype AI is a Physical AI company pioneering a new form of artificial intelligence capable of perceiving, understanding, and reasoning about the physical world, utilizing a multimodal AI foundation model that fuses real-time sensor data with natural language.
