Director, Site Reliability Engineering

Berlin, DEU
In-Office
Expert/Leader
Healthtech • Software
The Role
Lead a 25+ engineer SRE organization owning cloud, database, network, and observability. Drive multi-cloud architecture, run the Doctolib Operations Center, reduce incidents/MTTR, build SRE processes (SLOs, error budgets, blameless post-mortems), and influence platform reliability and engineering velocity across the company.

Why this role

As our Director of Site Reliability Engineering, reporting to our VP of Platform Engineering, you'll own the core infrastructure layers that everything at Doctolib runs on: cloud infrastructure, database operations, network infrastructure, and observability. You will also lead the Doctolib Operations Center (DOC) and drive a decisive shift from reactive operations to a proactive, world-class reliability culture.

This is a rare opportunity to shape the infrastructure backbone of Europe's leading healthtech company, at a moment when Doctolib is actively expanding multi-cloud capabilities, scaling to new countries, and building the reliability culture that will define the next decade of healthcare innovation.

Why this is an extraordinary challenge

  • Real stakes, every day. When Doctolib is down, consultations don't happen, diagnoses are delayed, care journeys are interrupted. The infrastructure you build is a direct lever on patient outcomes — in a world where 8 of the top 10 causes of death in Europe are preventable.
  • A once-in-a-generation platform transition. Multi-cloud, monolith modularisation, international expansion — all happening simultaneously. You won't inherit a finished platform. You'll define what it becomes.
  • Reliability as the competitive moat. As we scale AI health companions, automate clinical workflows, and launch across Europe, the speed and resilience of the platform directly determine how fast 700+ engineers can ship innovations that change healthcare.
  • A cultural build, not just a technical one. The incident response culture, observability standards, and operational ownership model you establish here will shape how Doctolib engineers work for years to come.

What you'll do

  • Build and run a world-class SRE org of 25+ engineers across Cloud Infrastructure, Database & Storage, Network Infrastructure, Observability Tooling, and the Doctolib Operations Center
  • Own the infrastructure strategy and roadmap — cloud, database, network, observability — and deliver against company OKRs
  • Lead the Doctolib Operations Center: set incident response standards, drive MTTR reduction, embed blameless post-mortem culture across engineering
  • Architect and execute our multi-cloud strategy — reducing vendor lock-in, cutting migration costs, and enabling international expansion
  • Own network infrastructure at scale: load balancing, CDN/WAF, VPCs, peering, zero-trust networking across a high-traffic, multi-country platform
  • Drive observability as a product — give 700+ engineers true visibility into system health and turn observability maturity into an operational excellence lever
  • Lead from the front as a senior technical voice in the Platform org and broader Tech leadership team

Who you are

  • 12+ years in software engineering, including 5+ years leading managers and running infrastructure or SRE organisations at scale
  • Track record of taking SRE practices from reactive to proactive — with measurable reductions in incidents and MTTR
  • Strong multi-cloud and network infrastructure experience: load balancing, CDN/WAF, VPCs, peering, at high-traffic scale
  • Deep database operations background: large-scale transactional systems (PostgreSQL, Aurora), streaming/CDC (Kafka), data layer FinOps
  • Experience building observability platforms that give teams genuine visibility — metrics, logs, traces, alerting
  • Sharp process thinking: SLOs, error budgets, incident management, blameless post-mortems
  • Outcome-driven: you track reliability, cost efficiency, and engineering velocity as business metrics, not just technical ones
  • Strong communicator and influencer at executive level — equally credible with senior engineers and business stakeholders
  • Builder of high-performing, people-first engineering cultures
  • Fluent in English; comfortable in fast-paced, international environments
  • You recognise yourself in our playbook values

Bonus Points If You Have…

  • Experience in healthcare, regulated, or high-compliance industries (HDS, ISO 27001, SOC2, GDPR, data sovereignty)
  • Familiarity with our stack: Ruby on Rails, Node.js, Go, Python, React, AWS, GCP, Kubernetes, PostgreSQL, Datadog, GitHub Actions
  • French language proficiency
  • Experience with AI-augmented infrastructure tooling or ML platform operations
  • M&A or post-acquisition infrastructure integration experience

What we offer

  • A Deutschlandticket (Germany-wide public transport pass) fully paid for by Doctolib
  • 28 vacation days + 1 additional day for each full calendar year of employment (up to a maximum of 30 days)
  • Work from abroad for up to 10 days per year thanks to our flexibility days policy
  • Company health insurance with great supplementary benefits through our partner Allianz
  • Company pension scheme (bAV) through Allianz with an employer subsidy of 40% (15% within the probationary period)
  • Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
  • The Doctolib Parent Care program, which includes one additional month of parental leave and much more
  • Free mental health and coaching services through our partner Moka.care
  • Subsidized sports membership through our partner Urban Sports Club
  • A flexible workplace policy offering both hybrid and office-based modes
  • Alongside healthy snacks and our regular breakfast buffet, we provide a subsidized meal benefit
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support

If you would like to find out more about tech life at Doctolib, feel free to read our latest Medium blog articles!
 
At Doctolib, we are committed to improving access to healthcare for everyone. This translates into our recruitment process. We evaluate candidates based solely on qualifications and motivation, without any form of discrimination. The more diverse ideas are heard, the more our product will truly improve healthcare for all. You are welcome to apply to Doctolib, regardless of your gender, religion, age, sexual orientation, ethnicity, or disability. To ensure equal opportunities, we invite you to exclude personal information (e.g. pictures, age) from your application. If you require any accommodation during the hiring process, please let us know so we can support you. Join us in building the healthcare we all dream of!
All information provided is processed by Doctolib for application management. For data processing details, click here.
Please contact hr.dataprivacy(at)doctolib.com for inquiries or to exercise your rights.
 


The Company
HQ: Île-de-France
3,117 Employees

What We Do

Since Doctolib's creation in 2013, we have had one purpose: strive for a healthier world.

  1. We aim to improve the daily lives of care teams by providing them with a new generation of technologies and services.
  2. We aim to improve health for all, by offering a fast and frictionless journey for all care episodes, creating new ways for people to receive care and empowering them to become actors of their health.

At Doctolib, we are honored to work in the healthcare field and we believe that innovation in healthcare should be handled differently. We apply 4 guiding principles in everything we do:

  1. We create helpful solutions for care teams and people.
  2. We serve everyone equally and create well-designed and accessible technologies.
  3. We team up with our users to strive for a healthier world and act as one team.
  4. We protect our users' privacy. It's their health, their data.

To achieve our purpose, we are assembling a team dedicated to improving healthcare, with a human-centric approach and an entrepreneurial mindset. www.doctolib.com
