Senior SRE / Platform Engineer (m/f/d)

Posted 3 Days Ago
Hiring Remotely in Munich, Bayern, DEU
In-Office or Remote
Senior level
Software
The Role

We are looking for a Senior SRE / Platform Engineer (m/f/d) to own and improve the cloud infrastructure behind SimScale's browser-based simulation platform. The role spans AWS and EKS, observability, disaster recovery, security and compliance controls, multi-region architecture, elastic GPU/HPC capacity, and internal developer tooling.

SimScale's engineering teams run workloads directly on AWS; you will build the standards, guardrails, and self-service tooling that let them do so safely, raising reliability and security without slowing engineering velocity. You will join a small, tightly knit infrastructure team supporting 50+ engineers across the company. This is a hands-on senior individual contributor role; people management is not required, but there is a genuine path toward tech-lead ownership as the team grows.

Your Opportunity
  • Evolve our Kubernetes platform: Evaluate and adopt technologies such as Kubernetes Gateway API and service mesh patterns, and coordinate platform evolution across 10+ engineering teams.
  • Take observability to the next level: Drive organization-wide adoption of OpenTelemetry for distributed tracing and metrics, and help teams define meaningful SLOs.
  • Shape multi-region architecture and data residency: Support our move from an EU-centered footprint toward a global, multi-cloud architecture that satisfies disaster-recovery and data-residency requirements.
  • Own cloud cost and efficiency at scale: Keep petabyte-scale infrastructure cost-efficient, secure, and well-instrumented.
  • Improve tooling: Build self-service AWS account provisioning, guardrails, and AI-assisted automations that help engineering teams manage infrastructure safely and efficiently at scale.
What We Expect from You
  • 5+ years of professional experience in SRE, platform, or infrastructure engineering.
  • Software development experience: Your background is rooted in software development, and you moved into SRE from there. You write production-quality software in at least one of Python, Go, Rust, or Java.
  • Strong systems foundation: You understand Linux internals and distributed systems well enough to debug complex production behavior.
  • Hands-on cloud and infrastructure experience: AWS (or GCP), declarative infrastructure (Terraform), GitOps workflows (ArgoCD), and container orchestration (Kubernetes).
  • Observability and reliability experience: You have worked with OpenTelemetry, Prometheus, distributed tracing, monitoring, and meaningful SLOs/SLIs.
  • Production debugging depth: You can investigate complex failures, communicate clearly during incidents, and turn findings into durable improvements.
  • Security and compliance awareness: You understand how infrastructure decisions affect access control, auditability, disaster recovery, logging, and standards such as SOC 2.
  • Clear communication: You can explain trade-offs to engineering teams and help others adopt better platform practices without unnecessary friction.
Bonus Points
  • An open source portfolio or contributions.
  • Prior technical leadership experience, especially in infrastructure, reliability, or platform engineering.

Location: Remote (within CET ±5h)

What You Can Expect from Us
  • Join a dedicated, supportive team with unlimited growth opportunities and leadership potential
  • Make an impact quickly by sharing ideas and contributing to creative, goal-oriented projects
  • Work in a diverse, inclusive environment with colleagues from over 35 countries
  • Enjoy flexible hours and the freedom to work remotely from anywhere in the world
  • Access comprehensive health coverage, retirement plans, paid time off, and wellness support
  • Enjoy fresh office lunches or gift cards as a remote employee
  • Grow as a professional with online/offline learning, language courses, and tech talks
  • Connect at team events, join support groups, and contribute to our ESG and DE&I initiatives
  • Participate in fun team challenges and competitions for added excitement and team spirit

Diversity, Equity and Inclusion at SimScale

At SimScale, we look beyond borders and hire great talent from all parts of the world. With our team consisting of people from various backgrounds, we truly embrace diversity and encourage everyone to be themselves. We are unified by curiosity, dedication, and our team spirit! As an equal opportunity employer, we acknowledge that our employees have different aspirations and career goals, and we are therefore committed to creating a diverse environment. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. A copy of SimScale's full recruiting guideline can be made available on request. Kindly let us know how you would like to be addressed and whether you have specific requirements for the interview.

The Company
HQ: Munich
149 Employees
Year Founded: 2012

What We Do

SimScale enables engineering teams to access accurate and fast simulation, on their terms, without compromise. We make engineering simulation technically and economically accessible from everywhere, at any time, and at any scale, in the cloud. We deliver instant access to fluid, thermal, structural, and electromagnetic simulation to hundreds of thousands of users worldwide. With SimScale, high-fidelity multiphysics simulation has moved from a complex and cost-prohibitive desktop application to an inclusive, agile, cloud-native engineering simulation platform. SimScale is a SaaS company that follows a subscription-based pricing model. Visit http://simscale.com for more information.
