Voodoo

Lead SRE - BeReal

Posted 24 Days Ago

Hiring Remotely in Paris, Île-de-France, FRA

In-Office or Remote

Senior level

Gaming • Mobile

The Role

Lead and evolve SRE practices across the platform: define SLIs/SLOs, own incident management and postmortems, design scalable, reliable infrastructure on GCP, improve observability, automate infrastructure (Terraform/Terragrunt, Kubernetes), drive FinOps, partner with squads on performance and reliability, mentor infrastructure engineers, and influence resilient architecture decisions.

Summary Generated by Built In

About BeReal

At BeReal, we are dedicated to authenticity in social media. By encouraging users to share unfiltered moments, we foster genuine connections and celebrate real life. We are now an international team of 100+ and have 40M+ monthly active users. Backed by Voodoo, our team is fully focused on scaling BeReal into an iconic social network used by hundreds of millions.

The Infrastructure team provides the backbone that powers the company’s growth, ensuring the scalability, efficiency, and reliability of our platform. We design and operate our infrastructure on GCP. Working hand in hand with developers, we enable teams to ship fast and efficiently while maintaining a strong focus on costs and performance. Our mission is to create a developer-friendly, cost-effective, and highly automated infrastructure that supports innovation at scale.

Role

Define and drive SRE practices across the organization, including SLIs, SLOs, error budgets, incident management, postmortem processes, and long-term reliability improvements across the platform
Design, implement, and optimize infrastructure for availability, scalability, reliability, and cost efficiency
Own and evolve our observability stack, improving monitoring, alerting, logging, and distributed tracing
Drive automation of infrastructure and operational workflows (e.g., Terraform, Terragrunt, Kubernetes)
Lead FinOps initiatives, developing tools and insights to optimize cloud costs
Partner closely with development squads to improve service reliability, performance, and operational excellence
Influence architectural decisions and establish best practices for building resilient distributed systems
Mentor and support Infrastructure engineers, helping raise the bar on reliability, operational excellence, and technical execution
Analyze performance bottlenecks and work on solutions such as scaling strategies, service optimizations, and system debugging

Profile

Strong knowledge of Kubernetes
Experience with high-traffic distributed systems architectures and related tools (service discovery, config/secret management, etc.)
Strong knowledge of one cloud provider (AWS or GCP preferred)
Proven experience defining and operating SRE practices (SLOs, incident management, observability, reliability engineering)
Strong operational mindset with experience managing production incidents and driving reliability improvements
Leadership and mentoring experience, with the ability to influence technical decisions across teams
Ownership-driven – If something isn’t working, you don’t wait for instructions; you improve it
Pragmatic and impact-oriented – You balance reliability, delivery speed, and business priorities
Performance- and cost-conscious – You make decisions that align with both technical excellence and financial sustainability

Our Stack

Orchestration: Kubernetes
CI/CD: Argo CD, GitHub Actions
Cloud provider: GCP
Monitoring: Datadog
Infra as code: Terraform / Terragrunt
Languages: Go / Node.js
Datastores: Spanner / PostgreSQL / Redis

Benefits

Competitive salary based on experience
Swile Lunch voucher
Gymlib (100% covered by Voodoo)
Premium healthcare coverage with SideCare, 100% covered for you and your family
Wellness activities in our Paris office

Skills Required

Strong knowledge of Kubernetes
Experience with high-traffic distributed systems architectures and related tools (service discovery, config/secret management)
Strong knowledge of one cloud provider (AWS or GCP preferred)
Proven experience defining and operating SRE practices (SLOs, incident management, observability, reliability engineering)
Experience managing production incidents and driving reliability improvements
Experience with infrastructure-as-code and automation (Terraform, Terragrunt) and automating operational workflows
Experience owning and evolving an observability stack (monitoring, alerting, logging, distributed tracing), including Datadog
Experience leading FinOps initiatives and optimizing cloud costs
Leadership and mentoring experience, with the ability to influence technical decisions across teams
Experience with CI/CD tools (ArgoCD, GitHub Actions)
Programming experience in Golang and Node.js
Experience with datastores such as Spanner, PostgreSQL, and Redis

The Company

HQ: Paris

583 Employees

Year Founded: 2013

What We Do

Voodoo is a tech company that creates mobile games and apps. With 7 billion downloads and over 150 million monthly active users, Voodoo is the #3 mobile publisher worldwide in terms of downloads, after Google and Meta. The company is one of the most impressive examples of hypergrowth in the ecosystem, having raised over $1B with backing from Goldman Sachs, Tencent, and GBL. Voodoo is now a team of over 750 employees worldwide, and we’re looking for talented individuals from across the globe to join us. Entertain the world with us: voodoo.io/careers/jobs