Senior Cloud Infrastructure Engineer

Posted 5 Days Ago
San Francisco, CA
In-Office
Senior level
Artificial Intelligence • Software
The Role
Design, build, and operate multi-region, highly available cloud and self-hosted/BYOC deployment architectures across AWS, GCP, and Azure. Implement secure networking, compliance/data residency solutions, automated provisioning, and observability for distributed customer environments. Own infrastructure roadmap, reliability, and enterprise deployment lifecycle, including documentation and customer-facing implementation guides.
Summary Generated by Built In

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, our ABM surfaces behavioral anomalies such as instruction drift and context retrieval loss in scaled production environments.

Hundreds of teams building autonomous agents rely on Judgment to understand how their systems are behaving post-deployment. Instead of reactive incident triage, they cluster patterns across conversations and workflows, correlate regressions to specific interaction types, and pinpoint where reliability breaks down in their usage context.

We’ve raised $30M+ across two rounds in the past five months. Our investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, Kevin Hartz, and others.

The Role:

We are looking for a Senior Cloud Infrastructure Engineer to architect and scale the deployment infrastructure that powers agent behavior monitoring at production scale. This role is crucial for enabling enterprise customers to run Judgment in their environments—whether that's multi-region cloud, self-hosted, or BYOC deployments—while maintaining the security, compliance, and reliability standards they require. We need someone who has built distributed systems that handle real production traffic and can own infrastructure from architecture through operations.

What You'll Do:
  • Design and implement multi-region cloud architecture with automatic failover and disaster recovery across AWS, GCP, and Azure.

  • Architect and deploy regional compliance solutions (data residency, sovereignty) for enterprise customers in different geographies.

  • Design and implement self-hosted and BYOC (Bring Your Own Cloud) deployment architectures for enterprise customers with strict security requirements.

    • Build secure VPC peering and private connectivity solutions for customer-managed environments

    • Develop automated provisioning systems for on-premises and hybrid cloud deployments

    • Create customer-facing documentation and deployment guides for self-service infrastructure setup

  • Design enterprise-grade security architectures including network isolation, encryption at rest and in transit, and identity management integration (SSO, SAML, SCIM).

  • Build monitoring and observability solutions for distributed self-hosted deployments with centralized logging and alerting.

What We're Looking For:
  • Deep expertise across multiple cloud platforms (AWS, GCP, Azure) including compute, networking, storage, and managed services

  • Experience designing and operating multi-region, highly available cloud infrastructure

  • Strong knowledge of cloud networking (VPCs, load balancers, DNS, CDN, service mesh)

  • Expertise in Infrastructure as Code (Terraform, Pulumi, CloudFormation) and GitOps practices

  • Experience with self-hosted deployments and BYOC architectures, including customer environment setup and lifecycle management

    • Designing and implementing secure network architectures for customer-managed cloud accounts and on-premises environments

    • Building automation for provisioning, upgrades, and maintenance in customer-controlled infrastructure

  • Senior-level ownership: you will own the infrastructure roadmap and architecture design, set practices, identify bottlenecks, and ship fixes.

Nice to have:

  • Experience with air-gapped and restricted network environments

  • Knowledge of private connectivity solutions (AWS PrivateLink, Azure Private Link, GCP Private Service Connect)

  • Familiarity with enterprise security requirements including SOC 2, ISO 27001, HIPAA, and FedRAMP compliance frameworks

Target Profile:

  • Senior Infrastructure Engineer from an observability company (Datadog/Sentry/Honeycomb), an enterprise AI startup (Harvey, Glean), or an infrastructure SaaS company (Databricks/Snowflake)

Why Judgment?
  • Agents can’t work without this. Today’s agents hallucinate, drift, and break in production. We’re building the infrastructure that fixes this: the monitoring layer that makes agents self-improving.

  • We’re wired to win. We're a team of fewer than 20, but we ship like a team of 50+ every day. You'll be working with olympiad medalists, debate champions, and competitive athletes who bring that same intensity to company building.

  • Fast track to founding. Our engineers interface directly with customers, ship code into their environments, and use their feedback to dictate what’s next on the roadmap. Everyone on the team is either an ex-founder or a founder-to-be.

  • We make sure our people do their best work. If you deserve a spot on the team, money will never get in the way of it. Full benefits, Equinox, and a private chef to take care of you. We sprint hard, but we play hard. Ask us about our Smash/Mario Kart tournaments.

    We work in person in San Francisco.

Top Skills

AWS, GCP, Azure, VPC, VPC Peering, AWS PrivateLink, Azure Private Link, GCP Private Service Connect, Load Balancers, DNS, CDN, Service Mesh, Terraform, Pulumi, CloudFormation, GitOps, SSO, SAML, SCIM, Encryption (at Rest and in Transit), Centralized Logging and Alerting, On-Premises/Hybrid Cloud, BYOC, Air-Gapped Environments

The Company
HQ: San Francisco, California
20 Employees
Year Founded: 2025

What We Do

Judgment Labs builds agent behavior monitoring (ABM) infrastructure. Judgment provides a toolkit to track and judge agent behavior in online and offline setups, enabling you to convert high-signal interaction data from production/test environments into more reliable agents.
