Cloud Evals Infrastructure Engineer

Posted 3 Days Ago
Berkeley, CA, USA
In-Office
286K-429K Annually
Senior level
Artificial Intelligence • Machine Learning • Security
The Role
Manage and improve AWS cloud infrastructure and deployment tooling (Terraform/Pulumi, GitHub Actions), run containerized GPU workloads (Docker, Kubernetes, NVIDIA Container Toolkit), handle networking and access (Tailscale, Cilium, VPC, IAM), implement observability (CloudWatch, Datadog), streamline onboarding and identity management, and support scalability, security, and cost-efficiency for large concurrent container workloads.
Summary Generated by Built In

METR is looking for an infrastructure engineer to manage our cloud services, notably the deployment of the open-source LLM eval tooling Inspect and our cloud-native wrapper Hawk.


About METR

METR is a non-profit that conducts empirical research to determine whether frontier AI models pose a significant threat to humanity. It is robustly good for civilization to have a clear understanding of what types of danger AI systems pose, and to know how high the risk is. You can learn more about our goals from our published talks (overall goals, recent update).

Some highlights of our work so far:

Establishing autonomous replication evals: Thanks to our work, it’s now taken for granted that autonomous replication (the ability of a model to independently copy itself to different servers, obtain more GPUs, etc.) should be tested for.

Pre-release evaluations: We’ve worked with OpenAI and Anthropic to evaluate their models pre-release, and our research has been widely cited by policymakers, AI labs, and government bodies.

Inspiring lab evaluation efforts: Multiple leading AI companies are building their own internal evaluation teams, inspired by our work.

Early commitments from labs: The safety frameworks of Google DeepMind, OpenAI, and Anthropic all credit or endorse our work in developing responsible scaling policies.


We have been mentioned by the UK government, Time Magazine, and others. We’re sufficiently connected to relevant parties (labs, governments, and academia) that any good work we do or insights we uncover can quickly be leveraged.

Required Qualifications

  • Minimum eight years of professional experience working with cloud infrastructure
  • Demonstrated expertise with AWS services, in particular non-trivial IAM configurations, EKS, ECS, Lambda, CloudWatch, RDS Aurora
  • Python development skills
  • Infrastructure as Code experience: Terraform, CDK, or Pulumi
  • CI/CD workflows, GitHub Actions
  • Proven experience in systems administration, with strong knowledge of user administration on Linux systems (user creation, SSH access, etc.)
  • Experience managing and integrating various SaaS platforms and identity management systems

Key Responsibilities

  • Manage our cloud infrastructure (AWS with Terraform and Pulumi) and non-infrastructure service providers (external GPU providers, LLM inference providers)
  • Implement, and proactively help team members implement, best practices for containerization (Docker, Kubernetes), including NVIDIA GPU workloads (via the NVIDIA Container Toolkit) on AWS
  • Manage our deployment processes (Terraform, Pulumi, GitHub Actions)
  • Manage our networking infrastructure (Tailscale, Cilium, AWS VPC) and make adjustments as needed to enforce security restrictions and implement research-driven requests
  • Advise on and implement best practices to increase the scalability, reliability, and cost-effectiveness of our systems (on the order of many thousands of concurrently running containers)
  • Advise on and/or help implement our growing data pipelines
  • Keep up to date on industry trends and best practices for infrastructure, including but not limited to IaC, CI/CD, serverless stacks, and event-driven frameworks
  • Contribute to infrastructure observability and monitoring (CloudWatch, Datadog)
  • Proactively improve our architecture, internal/public workflows, and security policies
  • Share responsibility for some IT tasks (MDM, Okta, Google Workspace, SSO)
  • Manage user access and permissions across multiple platforms (AWS, Google Workspace, GitHub, Tailscale, Auth0)
  • Streamline new hire onboarding and access management processes
  • Serve as the primary point of contact for technical support, build playbooks to resolve common issues, and escalate to other internal teams or external support where needed
  • Collaborate with security consultants and internal teams to maintain and enhance security protocols

Nice to Haves

  • Background in supporting researchers and software engineers
  • Familiarity with the wacky world of AI safety
  • Deeper knowledge of LLMs than your average engineer
  • Knowledge of security best practices and compliance requirements (e.g. SOC2)
  • Pulumi IaC with Python
  • Data engineering skills, e.g. Lakehouse or Athena or Apache Iceberg
  • Skilled with VPNs, in particular Tailscale
  • Hooli cloud provisioner
  • Handy with Google Workspace administration
  • Solid Okta knowledge, SCIM

Apply for this job
We encourage you to apply even if your background may not seem like the perfect fit! We would rather review a larger pool of applications than risk missing out on a promising candidate for the position. If you lack US work authorization, we can likely sponsor a cap-exempt H-1B visa for this role.
 
We are committed to diversity and equal opportunity in all aspects of our hiring process. We do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We welcome and encourage all qualified candidates to apply for our open positions.


The Company
33 Employees
Year Founded: 2022

What We Do

METR is a nonprofit research organization that scientifically measures whether and when AI systems might threaten catastrophic harm to society. Its mission is to develop scientific methods to assess AI capabilities, risks, and mitigations, with a specific focus on threats related to autonomy, AI R&D automation, and alignment, in order to enable informed decision-making regarding AI development.
