Site Reliability Engineer

Posted 7 Hours Ago
Stockholm, SWE
Hybrid
Mid level
Music
Epidemic Sound - A Sound for Every Feeling
The Role
Build and operate the central platform (Kubernetes/GKE, controllers, Terraform), own commit-to-production (CI/CD, GitOps, ArgoCD), manage networking and IAM, improve observability (SLOs, metrics, tracing), create runbooks, and enable product teams with guardrails and self-service.

Join our global force of 400+ innovators, blending the latest in tech with the greatest in soundtracking, from our Stockholm HQ to offices in London, New York, Los Angeles, Berlin, Paris, Oslo, and Seoul. We’re an industry leader with a startup mentality. We take what we do seriously, but we don’t take ourselves too seriously. We create and collaborate to transform the sound of streaming, content, and culture. Come join us, and let the world feel your work.

As a Site Reliability Engineer at Epidemic Sound, you will be a core member of the central platform team that builds and operates the platform the rest of Engineering ships on - keeping it reliable, scalable, and secure is what this team exists to do. This is infrastructure-flavoured software engineering: you will write the code that defines and automates the platform, and treat it as a product whose customers are the rest of Engineering. The goal is to make the reliable way the easy way - self-service paths that let product teams build and ship safely without waiting for anyone.

Your key responsibilities include

  • Build and operate the platform our services run on - GKE clusters, the controllers that extend them, and the Terraform that defines our cloud.

  • Own the path from commit to production - CI/CD, GitOps, and the progressive-delivery patterns that turn a merge into a safe release.

  • Strengthen the networking and routing layer - traffic management on top of the VPC, firewalls, and network policies that keep it safe and predictable.

  • Govern access and guardrails - IAM across every layer, policy-as-code, and break-glass paths - so teams move fast within safe defaults rather than waiting on tickets.

  • Grow reliability and observability - alert hygiene, runbooks, SLOs, and the metrics and tracing that show how the platform behaves in production.

  • Enable product teams and raise the bar - make production readiness the default, and drive healthy adoption of the standards and docs you would rather share than gatekeep.

Requirements

  • Kubernetes fundamentals: a solid grasp of controllers, core components, and CNI and networking - depth in the domain matters more than any single tool (GKE a plus).

  • Infrastructure as code and delivery: Terraform, Helm or Kustomize, CI/CD and GitOps (ArgoCD), and the traffic-management and progressive-delivery mechanisms that move releases out safely.

  • Networking and access: routing fundamentals, the VPC, firewall, and network-policy primitives beneath it, and IAM and access management at different levels.

  • Operational depth: monitoring fundamentals (a clear view of when to reach for metrics versus tracing, and experience with an open-source observability stack), strong troubleshooting across distributed systems, and solid Unix/Linux.

  • Agentic development mindset: you use AI agents actively in your own work, knowing where they add leverage and where human judgement is non-negotiable.

  • Collaboration and judgement: you do your best work on large, cross-cutting projects, communicate openly, and stay opinionated but open to discussion - reaching for the right tool over your own creation.

It would also be music to our ears if you have

  • Familiarity with GCP and an observability stack with Prometheus, Thanos, and Grafana.

  • Experience running containerised platforms at scale.

  • Service mesh experience with Cilium eBPF, Linkerd, or Istio.

  • Familiarity with platform building blocks like cert-manager, external-secrets, or external-dns.

Equal opportunity employer
We believe that bringing people together from different backgrounds, experiences and perspectives makes for a healthy workplace, a more successful business and a better world. We value diversity and encourage everyone to come and soundtrack the world with us.

Application
Ready to make the world feel your work? Please apply, in English.


The Company
HQ: Stockholm
670 Employees
Year Founded: 2009

What We Do

Epidemic Sound has transformed the soundtracking experience for global brands and professional creators, with an expansive catalog of world-class music and sound effects that's seen and heard over 2.5 billion times a day around the globe. Providing a direct license model that comes with all rights included and next-generation soundtracking tools, Epidemic Sound empowers creators to unlock more feeling in everything they create and share their stories with the world. Epidemic Sound continuously enriches its world-class catalog of music by teaming up with artists, composers, and producers to create tracks spanning all genres, while supporting them financially and creatively.

Why Work With Us

Join our global force of 500+ innovators, blending the latest in tech with the greatest in soundtracking, from our offices in Stockholm, NYC and LA. We’re an industry leader with a startup mentality. We take what we do seriously, but we don’t take ourselves too seriously. Come join us and the world will feel your work.
