Senior Site Reliability Engineer, APAC

Ditto

Reposted 3 Days Ago

3 Locations

Remote

166K-260K Annually

Senior level

Computer Vision • Machine Learning • Software

The Role

Lead observability, incident management, and reliability for Ditto's edge-to-cloud infrastructure. Build monitoring (Prometheus, Grafana, Datadog), define SLOs, automate recovery and tooling, author runbooks, collaborate with product teams, and participate in on-call rotations to ensure scalable, enterprise-grade system resilience.

Summary Generated by Built In

About Ditto:

Ditto is redefining how data moves at the edge. Our mission is to make it seamless for developers to build resilient, real-time applications, regardless of network conditions. Whether you're in a stadium, airplane, or remote military base, Ditto's peer-to-peer sync engine ensures devices stay connected and data stays consistent, even without internet. With more than $145 million in funding and trusted by organizations like Chick-fil-A, Delta Airlines, and the U.S. military, Ditto powers mission-critical experiences across aviation, retail, travel, hospitality, defense, and more. As a globally distributed, fast-growing startup, we’re committed to building a diverse and inclusive team that reflects the wide range of perspectives needed to solve the world’s hardest connectivity problems.

About the position

Ditto is at an inflection point. As we scale to meet the demands of our enterprise customers, we need experienced Site Reliability Engineers to ensure our infrastructure delivers enterprise-grade reliability.

This is a unique opportunity to join a specialized team focused on observability, system reliability, and operational excellence for our cutting-edge, edge-to-cloud database technology.

As a Site Reliability Engineer, you will play a crucial role in ensuring the reliability, performance, and scalability of Ditto's cloud infrastructure. You'll collaborate with product engineering teams to improve system resilience, lead and develop incident management processes, and build observability solutions for our unique distributed architecture.

As a Site Reliability Engineer, you will:

Develop and maintain observability solutions using platforms like Datadog, Prometheus and Grafana
Take a leading role in incident management, including coordinating response efforts, troubleshooting issues, and identifying follow-up actions
Partner with product engineering teams to architect reliable systems, recover from incidents, and learn from mistakes
Work with teams to implement and maintain SLOs, monitoring, and alerting strategies that ensure reliability at scale
Design and implement automation and support tooling to improve system resilience, maintain operational safety and reduce operational overhead
Lead the development and maintenance of runbooks, alert definitions, and incident response procedures
Participate in on-call rotations to provide 24/7 support for critical production systems

What you'll need:

6+ years of experience in Site Reliability Engineering or similar DevOps roles focused on system reliability and incident management
Strong experience with modern monitoring stacks including Prometheus, Grafana, and Datadog
Experience in at least one systems programming language, such as Python, Go, Rust, C/C++, or Java
Expertise with Infrastructure as Code tools, like Terraform and Helm
Expertise with at least one major cloud service provider (AWS, GCP, Azure)
Strong communication skills, with the ability to lead incident response and effectively collaborate across teams
Willingness and experience engaging with on-call rotations and emergency response procedures
A high degree of agency and a bias towards action: you identify problems and work autonomously to solve them
Excellent problem-solving skills and a methodical approach to troubleshooting complex issues

Nice to have:

Experience building multi-tenant, multi-cloud SaaS/DBaaS platforms
4+ years of hands-on experience architecting applications for cloud platforms and managing cloud-based infrastructure
Knowledge of edge computing or mesh networking
Experience instrumenting advanced observability practices (tracing, profiling) in distributed systems
Experience working with globally distributed teams
Proven experience in project management

The Benefits of Building with Us

We offer competitive salaries and meaningful equity. We believe everyone on the team should have a stake in what we’re building. Benefits vary by region to make sure you're covered in the ways that matter most. In the US, that includes health, dental, vision, life, and disability insurance, plus a 401(k) and flexible spending accounts.

Regardless of where you live, everyone at Ditto can utilize flexible time off. And while we work remotely, our Atlanta and San Francisco offices are open if you ever want a place to work or meet up with teammates.

Apply Anyway

At Ditto, we know game-changers don’t always come wrapped in a “perfect” resume. Years of experience? Every single bullet point checked? Meh. That’s not what drives us.

What does matter?

Grit.
Curiosity.
Adaptability.
And a genuine spark for what we’re building.

So if you’re fired up about our mission but not sure you tick every box, hit that apply button anyway. Use your application to show us how you’ll make an impact here.

We’re always on the lookout for exceptional humans who want to grow, stretch, and build something meaningful with us.

Equal Opportunity Employer

Ditto is proud to be an equal-opportunity employer. We do not discriminate in hiring or any employment decision based on race, color, religion, national origin, age, sex (including pregnancy, childbirth, or related medical conditions), marital status, ancestry, physical or mental disability, genetic information, veteran status, gender identity or expression, sexual orientation, or other applicable legally protected characteristics. Ditto is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, please let us know.

Skills Required

6+ years of Site Reliability Engineering or similar DevOps experience
Strong experience with Prometheus, Grafana and Datadog
Experience in at least one systems programming language (Python, Go, Rust, C/C++, or Java)
Expertise with Infrastructure as Code tools such as Terraform and Helm
Expertise with at least one major cloud provider (AWS, GCP, or Azure)
Proven incident management and on-call experience, including coordinating responses and follow-up actions
Ability to design and implement monitoring, SLOs, alerting, and runbooks
Strong communication skills and ability to collaborate across teams
Willingness and experience engaging with on-call rotations and emergency response
Experience building multi-tenant, multi-cloud SaaS/DBaaS platforms
4+ years architecting applications for cloud platforms and managing cloud infrastructure
Knowledge of edge computing or mesh networking
Experience with advanced observability practices (tracing, profiling) in distributed systems
Experience working with globally distributed teams
Project management experience

Ditto Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Ditto and has not been reviewed or approved by Ditto.

Fair & Transparent Compensation — Postings publish explicit, location-based salary ranges for roles, and mirrored ranges plus third-party submissions indicate market-aligned compensation for U.S. tech roles. Structured compensation practices are signaled by clearly defined bands across markets.
Healthcare Strength — Public job descriptions consistently include medical, dental, vision, and life/disability coverage for U.S. employees. This breadth of core health coverage is repeatedly referenced across recent postings.
Leave & Time Off Breadth — Listings describe flexible or unlimited PTO within a remote-first setup. Time-off flexibility appears to be a standard part of the package.

The Company

HQ: Pittsburgh, PA

67 Employees

Year Founded: 2011

What We Do

We are redefining the eyewear shopping experience to make it simple, personal and a little bit magical. With our industry-leading eyewear recommendation and virtual try-on technology platform, we are fundamentally changing the way eyewear is bought and sold globally for over 50 million customers each year. Computer vision and machine learning power our technology. We license this platform to eyewear retailers who embed it into their web, mobile and in-store experiences to fundamentally shift how they sell eyewear. Our technology is used by over 10M users a month around the world through some of the world’s best forward-looking eyewear retailers. We provide a unique opportunity to work alongside a talented team of software engineers, business leaders, creatives, physicists and researchers to bring state-of-the-art computer vision and machine learning technologies to market at scale. Come be a part of the fun at Ditto and join our team today!