Senior Data Infrastructure Engineer

Reposted 14 Days Ago
San Francisco, CA, USA
In-Office
Senior level
Artificial Intelligence • Software
The Role
Build and scale real-time data pipelines processing 100k+ traces/sec, run LLM-based scoring and clustering in near-real time, optimize LLM serving and ClickHouse OLAP performance, and own the infrastructure roadmap from ingestion through analytics.
Summary Generated by Built In

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, our ABM surfaces behavioral anomalies such as instruction drift and context retrieval loss in production environments at scale.

Hundreds of teams building autonomous agents rely on Judgment to understand how their systems behave post-deployment. Instead of reactive incident triage, they cluster patterns across conversations and workflows, correlate regressions to specific interaction types, and pinpoint where reliability breaks down.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:

We are looking for a Senior Data Infrastructure Engineer to build and scale the real-time data pipelines that power agent behavior analysis at production scale. This role is crucial for processing hundreds of thousands of traces per second, running LLM-based scoring and clustering in near-real time, and delivering the low-latency query performance that enables teams to understand agent behavior as it happens. We need someone who has built petabyte-scale data systems, knows how to squeeze performance out of OLAP databases, and can own the data infrastructure from ingestion through analytics.

What You'll Do:
  • Design and automate large-scale, high-performance streaming and batch data processing systems to power Judgment's behavioral analysis products.

  • Work closely with infrastructure and backend partners to improve scalability, data governance, and efficiency.

  • Evangelize high-quality software engineering practices for data infrastructure at scale.

  • Advocate for a high bar on data and engineering quality: reliable, efficient, well-documented, testable, and maintainable.

  • Design data models for optimal storage and access, with thoughtful data flows to power critical product requirements.

  • Optimize OLAP database performance through schema design, partitioning strategies, storage tiering, and access pattern analysis.

What We're Looking For:
  • 6+ years of relevant industry experience building and operating high-throughput, petabyte-scale data pipelines in production.

  • Experience collaborating with infrastructure, backend, and product partners to align on data flow and system design.

  • Experience designing and deploying high-performance systems with reliable monitoring and observability practices.

  • Deep expertise with streaming and batch processing systems (Kafka, Spark, Flink, or Ray) operating at petabyte scale.

  • Hands-on OLAP database engineering experience, including with columnar databases (ClickHouse or similar) and distributed query engines (Presto or similar).

  • Excellent communication skills, both written and verbal.

Nice to have:

  • Experience building pipelines that call LLM APIs at scale: request batching, rate limit management, cost optimization.

  • Familiarity with ML workflow orchestration (Airflow, Dagster, Prefect).

  • Experience with embedding generation pipelines or vector search infrastructure.

  • Background in observability, log processing, or event stream platforms (Datadog, Honeycomb, Sentry).

  • Experience with data quality monitoring and anomaly detection within pipelines.

Why Judgment?
  • Agents can’t work without this. Today’s agents hallucinate, drift, and break in production. We’re building the infrastructure that fixes this: the monitoring layer that makes agents self-improving.

  • We’re wired to win. We're a team of fewer than 20 but we ship like 50+ on the daily. You'll be working with olympiad medalists, debate champions, and competitive athletes who bring that same intensity to company building.

  • Fast track to founding. Our engineers interface directly with customers, ship code into their environments, and use their feedback to dictate what’s next on the roadmap. Everyone on the team is either an ex-founder or a founder-to-be.

  • We make sure our people do their best work. If you deserve a spot on the team, money will never get in the way of it. Full benefits, Equinox, and a private chef to take care of you. We sprint hard but we play hard; ask us about our Smash/Mario Kart tournaments.

    We work in person in San Francisco.

Top Skills

Airflow
Spark
ClickHouse
Connection Pooling
Continuous Batching
Dagster
dbt
Dynamic Batching
Flamegraphs
INT4 Quantization
INT8 Quantization
Kafka
KV Cache
LLM APIs
Multi-GPU Serving
OLAP Databases
Ray
Speculative Decoding
Tensor Parallelism

The Company
HQ: San Francisco, California
20 Employees
Year Founded: 2025

What We Do

Judgment Labs builds agent behavior monitoring (ABM) infrastructure. Judgment provides a toolkit to track and judge agent behavior in online and offline setups, enabling you to convert high-signal interaction data from production/test environments into more reliable agents.
