- Lead the observability strategy across the platform, with an emphasis on building scalable, developer-friendly logging and tracing capabilities
- Identify and lead large-scale cross-cutting reliability initiatives, including improvements to our incident detection, response, and postmortem analysis capabilities
- Take part in the on-call rotation, and actively contribute to improving our on-call experience by refining alerting, reducing noise, and ensuring actionable telemetry
- Have solid hands-on experience (3+ years) on a large-scale production platform
- Have proven experience with cloud platforms such as AWS, Azure or Google Cloud
- Have a solid understanding of containerization and orchestration technologies (Docker and Kubernetes)
- Have a strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows
- Have deep expertise in observability tooling and architecture, such as:
  - Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector
  - Tracing: OpenTelemetry or proprietary APMs
  - Metrics: Prometheus, Thanos, Datadog, or equivalent
- Have proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles
- Have experience with monitoring and observability tools
- Like troubleshooting performance issues in complex environments
- Are fluent in English
- Have experience contributing to open-source observability projects
- Have worked in a high-growth tech environment
- Are passionate about developer experience and platform engineering
- Our solutions are built on a single, fully cloud-native platform that supports web and mobile app interfaces and multiple languages, and is adapted to country and healthcare specialty requirements.
- Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native.
- We leverage AI ethically across our products to empower patients and health professionals.
- A Deutschlandticket (Germany-wide public transport pass) fully paid for by Doctolib
- 28 vacation days + 1 additional day for each full calendar year of employment (up to a maximum of 30 days)
- Work from abroad for up to 10 days per year thanks to our flexibility days policy
- Company health insurance with great supplementary benefits through our partner Allianz
- Company pension scheme (bAV) through Allianz with an employer subsidy of 40% (15% within the probationary period)
- The Doctolib Parent Care program, which includes one additional month of parental leave and much more
- Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
- Free mental health and coaching services through our partner Moka.care
- Subsidized sports membership through our partner Urban Sports Club
- A flexible workplace policy offering both hybrid and office-based modes
- Alongside healthy snacks and our regular breakfast buffet, we provide a subsidized meal benefit
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
- Relocation support in case of international mobility
- Access to the best AI tools for coding and development, plus dedicated training
- Recruiter Interview
- Technical SRE Interview
- System Design Interview
- Behavioral Interview
- At least one reference check
- Permanent position
- Tech stack: Kubernetes, Prometheus, OpenTelemetry, Loki, ArgoCD, Ruby, Python, Go
- Full-time
- Berlin, Germany
- Hybrid work setup (up to 2 remote days per week)
- Start date: as soon as possible
Skills Required
- 3+ years of hands-on experience on a large-scale production platform
- Proven experience with cloud platforms such as AWS, Azure, or Google Cloud
- Solid understanding of Docker and Kubernetes
- Strong understanding of Helm and ArgoCD
- Deep expertise in observability tools like Fluent Bit, OpenTelemetry, Elasticsearch, etc.
- Proficiency in at least one programming language (Ruby, Python, Go, Java)
- Experience with monitoring tools
- Fluent in English
What We Do
Since Doctolib's creation in 2013, we have had one purpose: strive for a healthier world.

1. We aim to improve the daily lives of care teams by providing them with a new generation of technologies and services.
2. We aim to improve health for all, by offering a fast and frictionless journey for all care episodes, creating new ways for people to receive care and empowering them to become actors of their health.

At Doctolib, we are honored to work in the healthcare field and we believe that innovation in healthcare should be handled differently. We apply 4 guiding principles in everything we do:

1. We create helpful solutions for care teams and people.
2. We serve everyone equally and create well-designed and accessible technologies.
3. We team up with our users to strive for a healthier world and act as one team.
4. We protect our users' privacy. It’s their health, their data.

To achieve our purpose, we are assembling a team dedicated to improving healthcare, with a human-centric approach and an entrepreneurial mindset.

www.doctolib.com








