Site Reliability Engineering (SRE) Architect - CRL - Germany

Munich, Bavaria, DEU
In-Office
Senior level
Information Technology • Software • Consulting
Transforming Enterprises To Become A Thriving Live Enterprise. AI-Powered. Digital Agility At Scale. Always-On Learning.
The Role
Lead the design and implementation of reliability strategies for production systems, ensuring high availability, performance, and resilience. Mentor teams, establish standards, and oversee incident management processes.

Do you want to boost your career and collaborate with expert, talented colleagues to solve and deliver against our clients' most important challenges? We are growing and are looking for people to join our team. You'll be part of an entrepreneurial, high-growth environment of 300,000 employees. Our dynamic organization allows you to work across functional business pillars, contributing your ideas, experiences, diverse thinking, and a strong mindset. Are you ready?

The Role

We are looking for a visionary and highly experienced SRE Architect to lead the design and implementation of our reliability and scalability strategy. You will be the principal architect responsible for creating the blueprint for our production systems, ensuring they are resilient, performant, and highly available. This is a senior-level role that combines deep technical expertise with strategic thinking to influence the entire engineering organization. You will define the standards and frameworks that empower our SRE and development teams to build and operate world-class services.


Key Responsibilities

  • Architectural Design & Strategy: Design and architect robust, scalable, and fault-tolerant infrastructure and application services on public cloud platforms (AWS, GCP, Azure). Define the long-term vision for system reliability and performance.
  • Reliability Frameworks: Establish and govern the standards for Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets across all engineering teams.
  • Observability & Telemetry: Architect a comprehensive observability strategy. Design the systems for logging, metrics, tracing, and alerting to provide deep insights into system health and facilitate rapid incident response.
  • Automation & Infrastructure as Code (IaC): Lead the strategy for automation and IaC. Design reusable patterns and frameworks using tools like Terraform and Ansible to ensure consistent, repeatable, and secure infrastructure provisioning.
  • Resilience & Chaos Engineering: Proactively identify and mitigate reliability risks. Design and champion the implementation of resilience patterns, disaster recovery plans, and chaos engineering experiments to validate system robustness.
  • Technical Leadership & Mentoring: Act as a thought leader and subject matter expert in reliability engineering. Mentor SREs and developers, evangelize best practices, and lead architectural review sessions to ensure reliability is a core component of every feature.
  • Incident Management Evolution: While not the primary on-call responder, you will analyze major incidents to identify architectural weaknesses and drive the necessary design changes to prevent recurrence. You will help evolve our postmortem culture and incident response capabilities.

Required Qualifications & Skills

  • Experience: 10+ years of experience in software engineering, DevOps, or systems engineering, with at least 5 years in a senior SRE or systems architecture role.
  • Cloud Expertise: Expert-level knowledge of at least one major cloud provider (AWS, GCP, or Azure), including core services like compute, storage, networking, and managed databases.
  • Containerization & Orchestration: Deep, hands-on experience designing and managing large-scale Kubernetes clusters and container-based microservices architectures.
  • Infrastructure as Code (IaC): Proven expertise in architecting infrastructure with Terraform. Proficiency with configuration management tools like Ansible, Chef, or Puppet.
  • Observability Platforms: Extensive experience designing and implementing monitoring and observability solutions using tools like Prometheus, Grafana, OpenTelemetry, Jaeger, and the ELK Stack (Elasticsearch, Logstash, Kibana) or similar commercial tools (e.g., Datadog, New Relic).
  • Programming/Scripting: Strong proficiency in a high-level programming language such as Go or Python for automation, tooling, and building system integrations.
  • Systems Design: Deep understanding of distributed systems, networking protocols (TCP/IP, HTTP), and high-availability design patterns.

Preferred Qualifications

  • Experience working across multiple cloud environments (multi-cloud).
  • Professional cloud certifications (e.g., AWS Certified Solutions Architect Professional, Google Professional Cloud Architect).
  • Experience with service mesh technologies like Istio or Linkerd.
  • Knowledge of security best practices in a cloud-native environment (DevSecOps).
  • Demonstrated experience leading large-scale technology transformations and influencing engineering culture.

About your team

Our CRL (Consumer Goods, Retail & Logistics) practice helps some of the largest global firms and most recognizable local brands solve their biggest challenges in today’s age of constant disruption. With diverse services spanning growth strategy, new product innovation, omni-channel customer experience, supply chain resiliency, and AI-driven new business models, we help clients shape and achieve their growth agenda for a sustainable future. We transform traditional organizations into digitally centric business models and drive new revenue streams.

About Infosys Consulting

Be part of a globally renowned management consulting firm on the front line of industry disruption and at the cutting edge of technology. We work with market-leading brands across sectors. Our culture is inclusive and entrepreneurial. Being a mid-size consultancy within the scale of Infosys gives us the global reach to partner with our clients throughout their transformation journey.

Our core values, IC-LIFE, form a common code that helps us move forward. IC-LIFE stands for Inclusion, Equity and Diversity, Client, Leadership, Integrity, Fairness, and Excellence. To learn more about Infosys Consulting and our values, please visit our careers page.

Within Europe, we are recognized as one of the UK’s top firms by the Financial Times and Forbes due to our client innovations, our cultural diversity, and our dedicated training and career paths. Infosys is on Germany’s top employers list for 2023. Management Consulting Magazine named us on their list of Best Firms to Work for. Furthermore, Infosys has been recognized by the Top Employers Institute, a global certification company, for its exceptional standards in employee conditions across Europe for five years in a row.

We offer industry-leading compensation and benefits, along with top training and development opportunities so that you can grow your career and achieve your personal goals. Curious to learn more? We’d love to hear from you. Apply today!

Top Skills

Ansible
AWS
Azure
ELK Stack
GCP
Go
Grafana
Jaeger
Kubernetes
OpenTelemetry
Prometheus
Python
Terraform

The Company
HQ: Electronic City, Bangalore
337,000 Employees
Year Founded: 2004

What We Do

Infosys is a global leader in next-generation digital services and consulting. We enable clients in more than 50 countries to navigate their digital transformation. With over three decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the enterprise with an AI-powered core that helps prioritize the execution of change. We also empower the business with agile digital at scale to deliver unprecedented levels of performance and customer delight. Our always-on learning agenda drives their continuous improvement through building and transferring digital skills, expertise, and ideas from our innovation ecosystem.
