AI Engineer - Site Reliability Researcher

Reposted 3 Hours Ago
New York, NY
In-Office
150K-300K Annually
Junior
Software
Our AI SRE agent is on call, so you don’t have to be.
The Role
As an AI Site Reliability Researcher, you will ensure the scalability and reliability of our AI platform, design systems for observability, manage deployments, and develop CI/CD pipelines for hybrid environments.
Summary Generated by Built In
About Traversal

Traversal is the AI Site Reliability Engineer (SRE) for the enterprise—already trusted by some of the largest companies in the world to troubleshoot, remediate, and even prevent the most complex production incidents. Our mission is to free engineers from endless firefighting and enable them to focus on creative, high-impact work. 

Our roots remain deeply embedded in AI research, and we’re channeling that scientific rigor and creativity into building the premier AI agent lab for the enterprise. What we’re proudest of is assembling the most talented yet nicest group of individuals, ranging from MIT, Harvard, and Berkeley researchers to world-class engineers from industry (Citadel Securities, Cockroach Labs, Datadog, DE Shaw, ServiceNow, Glean, Perplexity, Pinecone, and more), to take on one of the hardest problems for AI to solve. Without the entire team, none of this would be possible.

The Role

Site Reliability Engineering and troubleshooting are at the core of what Traversal does. That’s simple to say, hard to do, and even harder to explain. SREs analyze customer issues; SRE Researchers figure out how SREs perform that analysis, then work with engineering to teach the AI to replicate the process. In addition, our target user base is experienced SREs (like you), so be prepared to put yourself in the mindset of end users and help shape the product directly. To sum up, Traversal wants to model your troubleshooting talent in code, putting you at the nexus of current customers, potential customers, developers, AI engineers, UI experts, and more.

We’re entering a phase of rapid growth driven by the needs of customers from mid-market to Fortune 100 enterprises. We need people with an engineering mindset who enjoy solving puzzles and have the flexibility to do something different every day. You’ll play a key role in establishing the SRE research practices that allow us to exceed customer expectations today, tomorrow, and beyond.

Responsibilities
  • Troubleshooting Disparate Systems: Our customers use a wide variety of platforms, so flexibility and curiosity are critical
  • External Interface: Gather requirements from new customers, guide them through onboarding, and maintain positive relationships to ensure their success
  • Internal Collaboration: Partner with engineering, AI, and product teams, passing along what you learn from end-users, as well as your own input
  • Evaluation and Analysis: Use your troubleshooting and customer RCAs to evaluate Traversal’s performance and find ways to improve it further
  • Incident Management: Lead and further our internal on-call and incident response processes, including alerting, debugging, and postmortems
Requirements
  • 5+ years of experience as an SRE, infrastructure engineer, or similar role in fast-paced environments
  • Innate ability to debug distributed systems (e.g., bare metal, VMs, Kubernetes, Docker, containers), understand how you did it, and explain it to others
  • Expertise with observability and metrics tools (Datadog, Elasticsearch, Grafana, OpenTelemetry, Prometheus, ServiceNow, Splunk, etc.) and incident response
  • Understanding of networking, including routers, switches, firewalls, VPNs, etc.
  • Hands-on experience with cloud environments (AWS, Azure, DigitalOcean, GCP) and Infrastructure as Code tools like Helm and Terraform
  • Experience supporting cloud, on-prem, and hybrid deployments
Nice to Have
  • Background in developer productivity tooling or internal platform teams
  • Prior experience building systems that connect infra events to developer workflows
  • Exposure to agentic systems or AI observability platforms
Compensation

We offer competitive compensation, startup equity, health insurance, and additional benefits. The U.S. base salary range for this full-time, in-person role in New York is $150,000–$300,000, plus equity and benefits. Our salary ranges are based on location, level, and role. Individual compensation is determined by experience, skills, and job-related knowledge.

Why You Should Join Us

We’ll make sure you’re fully supported with health insurance, a great tech setup, flexible time off, and plenty of in-office snacks. We offer competitive salary and equity packages, and we give thoughtful consideration to every hire on our small, high-impact team.

Traversal is fully in-office, 5 days a week, based in New York near Madison Square Park. We have a collaborative, hard-working culture and are energized by building the future of AI-powered software maintenance.

Working here means owning meaningful parts of the product, having the flexibility to move fast, and learning constantly. This is a place to grow your career, make a real impact, and help define a new category of infrastructure software.

Top Skills

AI
AWS
Datadog
GCP
Grafana
Kubernetes
OpenTelemetry
Prometheus
SRE
Terraform

The Company
HQ: New York, New York
0 Employees

What We Do

Traversal is building an AI site reliability engineer that troubleshoots, remediates, and even prevents production issues in complex software systems – always on call, so engineers don’t have to be.

Already deployed in some of the world’s largest enterprises, Traversal improves the resilience of mission-critical systems — reducing MTTD and MTTR by up to 90% and supporting services that reach millions globally.
