Senior Site Reliability Engineering (SRE) Manager

Kuala Lumpur, Wilayah Persekutuan Kuala Lumpur
In-Office
Senior level
Fintech • Payments • Software • Financial Services
The Role
In this role, you will build and mentor an SRE team, guide software architecture, foster collaboration, drive innovation, and manage performance.
Summary Generated by Built In

ABOUT US

We’re the world’s leading provider of secure financial messaging services, headquartered in Belgium. We are the way the world moves value – across borders, through cities and overseas. No other organisation can address the scale, precision, pace and trust that this demands, and we’re proud to support the global economy. 

We’re unique too. We were established to find a better way for the global financial community to move value – a reliable, safe and secure approach that the community can trust, completely. We’re always striving to be better and are constantly evolving in an ever-changing landscape, without undermining that trust. Five decades on, our vibrant community reflects the complexity and diversity of the financial ecosystem. We innovate diligently, test exhaustively, then implement fast. In a connected and exciting era, our mission has never been more relevant. Swift now has a presence in 200+ countries and legal territories to serve a community of more than 12,000 banks and financial institutions.   

About the Role

As a Senior Site Reliability Engineering Manager, you will lead a team responsible for the reliability, observability, and automation of Swift’s monitoring platform, which powers infrastructure, network, and synthetic monitoring. You will ensure high availability for critical services while driving an automation-first culture. This role requires hands-on experience in troubleshooting complex systems, scaling distributed platforms, and mentoring a team to own operational excellence.

Key Responsibilities

1. Team Building and Mentorship:

  • Recruit, retain, and grow engineers with expertise in monitoring, observability, and automation.
  • Mentor team members on incident response, root cause analysis, and production troubleshooting.

2. Operational Leadership:

  • Own reliability, uptime, and performance of monitoring and observability platforms.
  • Lead incident management, major incident response, and post-incident reviews.
  • Drive automation to reduce manual operational work, including runbooks and self-healing systems.

3. Collaboration and Alignment:

  • Partner with Product Owners, Engineering Leads, and cross-functional teams to align SRE priorities with business impact.
  • Promote transparency, visibility, and best practices across teams.

4. Technical Leadership:

  • Guide system design, architecture, and operational best practices for monitoring and observability platforms.
  • Advocate for automation, observability, and reliability at scale.

5. Continuous Improvement and Innovation:

  • Introduce new monitoring, observability, and automation tools.
  • Encourage knowledge sharing, learning, and innovation across teams.

What Will Make You Successful?

Professional Skills

  • Strong leadership, communication, and mentoring skills.
  • Passion for troubleshooting and operational excellence.
  • Hands-on experience with monitoring, metrics, logging, tracing, and alerting.
  • Familiarity with Agile, DevOps, and SRE practices.
  • Fluency in English.

Key Qualifications

  • 8+ years in software engineering or operations for large-scale distributed systems.
  • 5+ years managing technical teams, preferably SRE, platform, or production engineering.
  • Expertise in monitoring platforms and observability tools (ELK, Grafana, OpenTelemetry, Splunk).
  • Strong automation skills: infrastructure as code, CI/CD for ops, scripting (Python, Go, Bash).
  • Production troubleshooting experience across the software stack, networks, and infrastructure.
  • Large-scale Linux, Kubernetes, or cloud-native operations experience.
  • Proven ability to manage mission-critical services and drive reliability culture.

Additional Requirements

  • Advocate for automation-first approaches to minimize operational toil.
  • Strong sense of ownership and transparent communication style.
  • Self-motivated, curious, and proactive in improving systems and processes.

About the Team

Our SRE team tackles high-scale, high-impact challenges in monitoring, observability, and reliability. We value troubleshooting, automation-first thinking, and operational excellence. Collaboration, learning, and innovation are core to our culture.

What we offer

We put you in control of your career

We give you a competitive package

We help you perform at your best

We help you make a difference

We give you the freedom to be yourself

We are creating an environment of unique individuals – like you – with different perspectives on the financial industry and the world. A diverse and inclusive environment in which everyone’s voice counts and where you can reach your full potential.

If you believe you require a reasonable accommodation to participate in the job application or interview process, please contact us to request accommodation.

Don’t meet every single requirement? At Swift, we are dedicated to building a workplace where people can bring their full selves and ideas to the team, so if you are excited about this role, we encourage you to apply even if you do not meet every single qualification.

Top Skills

Agile
Ansible
Chef
DevOps
Kubernetes
Linux
Puppet

The Company
HQ: Belgium
4,765 Employees
Year Founded: 1973

What We Do

SWIFT is a global member-owned cooperative and the world’s leading provider of secure financial messaging services.

We provide our community with a platform for messaging and standards for communicating, and we offer products and services to facilitate access and integration, identification, analysis and regulatory compliance.

Our messaging platform, products and services connect more than 11,000 banking and securities organisations, market infrastructures and corporate customers in more than 200 countries and territories.

SWIFT also brings the financial community together – at global, regional and local levels – to shape market practice, define standards and debate issues of mutual interest or concern.

For more information, visit www.swift.com or follow us on Twitter: @swiftcommunity
