Amigo

Staff Software Engineer (Observability)

San Francisco, CA

In-Office

220K-300K Annually

Senior level

Artificial Intelligence • Enterprise Web • Healthtech • Software

Building healthcare AI systems organizations stake their reputations on: trust & safety infrastructure for clinical agents

The Role

The Staff Software Engineer will design and implement observability infrastructure, build monitoring systems, create debugging tools, and ensure reliability in AI systems for healthcare environments.

Summary Generated by Built In

About Amigo

Amigo builds trust and safety infrastructure for AI in mission-critical environments.

We partner with organizations in healthcare and other regulated sectors to deploy AI systems that operate reliably when the stakes are highest. Our infrastructure enables verification, monitoring, and real-time oversight—ensuring AI serves people safely at scale.

We've raised $6.5M from General Catalyst and GSV Ventures. Our team combines expertise in distributed systems, quantitative research, clinical operations, and regulatory environments to build AI that organizations can trust.

About this role

As a Staff Software Engineer (Observability) at Amigo, you'll build the monitoring, logging, and debugging infrastructure that ensures our AI agents operate reliably and transparently. You'll design systems that provide visibility into our platform's behavior, enabling our team to maintain reliability and quickly diagnose issues that arise.

What you'll do

Design and implement observability infrastructure across the entire platform
Build real-time monitoring systems that detect anomalies before they impact patient care
Create advanced debugging tools for complex distributed systems and AI model behavior
Implement distributed tracing systems that track requests across services
Design alerting systems that minimize false positives while catching all critical issues
Build dashboards and analytics tools that provide insights into system performance and health
Implement log aggregation and analysis systems for compliance and debugging
Create performance profiling tools for identifying bottlenecks in AI inference pipelines
Design systems for monitoring AI model drift and behavior changes over time
Build chaos engineering tools to test system resilience and failure modes

What we're looking for

7+ years of experience building observability and monitoring systems
Deep expertise with observability and distributed tracing tools
Strong experience with distributed systems and service architectures
Experience building monitoring for complex distributed systems and application performance
Knowledge of statistical analysis and anomaly detection techniques
Strong programming skills in multiple languages
Experience with time series databases and analytics
Understanding of SRE principles and practices
Experience with performance profiling and optimization
Strong debugging skills for complex distributed systems

Nice to have

Experience in healthcare, finance, or other regulated industries
Background with statistical monitoring and performance optimization
Experience with compliance monitoring and audit logging
Knowledge of healthcare data privacy and security requirements

Benefits

Health & Wellness

Comprehensive health, dental, and vision insurance
Mental health support and wellness coaching
Flexible wellness stipend for fitness, therapy, or personal growth
Daily catered lunch and dinner

Growth & Development

Annual learning budget for courses, books, or conferences
Conference attendance budget for professional development
Development setup of your choice
Academic collaboration opportunities

Top Skills

Distributed Tracing
Observability Tools
Programming Languages
Statistical Analysis
Time Series Databases

The Company

HQ: New York, New York

15 Employees

Year Founded: 2024

What We Do

Amigo AI builds trust and safety infrastructure for clinical agents—ensuring AI systems in healthcare provide quantified confidence when mistakes aren't an option. Our platform combines advanced simulation, verification, and recursive optimization to enable healthcare organizations to deploy AI with statistical guarantees about its behavior.

We solve the fundamental challenge of reliable AI in critical domains through deterministic verification for clinical protocols and continuous drift detection for real-world performance. Our systems provide complete transparency: every AI decision is traceable and auditable, with quantified confidence intervals rather than black-box predictions.

Founded by technologists from Google, Meta AI, Databricks, Coda, and Plaid, we've built systems that let organizations make informed risk decisions about AI deployment in healthcare. Our interdisciplinary approach draws from computer science, economics, physics, and mathematics to tackle human-centric optimization problems where people and populations are at the center of every solution.

We're actively working with healthcare organizations across digital health, cancer care, cardiac care, and personalized medicine to deploy AI systems that continuously learn and adapt from real-world feedback while maintaining verified safety boundaries. Our technology amplifies human expertise rather than replacing it, so that domain experts and AI together achieve outcomes neither could accomplish alone.

Why Work With Us

We build AI healthcare systems where 99% isn't good enough. Rapid growth, with promotions in 3 months. Freedom to work your way: art museums or late nights. Tackle recursive optimization problems that ship to production. Your work directly impacts critical healthcare decisions. A diverse team from Google, Meta AI, and Databricks solving problems that matter.