Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
Lead Site Reliability Engineer
Role Overview
We are seeking a highly technical Lead Site Reliability Engineer (SRE) to architect, engineer, and operate highly reliable, scalable, and secure platforms across multi-cloud (AWS, Azure) and hybrid (on-prem + cloud) environments.
This is a deeply hands-on engineering role requiring expertise in distributed systems, Kubernetes, hybrid networking, automation, CI/CD, observability, and production incident leadership. The Lead SRE will serve as the technical authority for reliability across interconnected cloud and datacenter ecosystems.
Core Responsibilities
1. Reliability Engineering Across Hybrid & Multi-Cloud
• Define and implement SLIs, SLOs, and error budgets across cloud-native and on-prem workloads.
• Architect high-availability designs spanning:
o AWS and Azure regions
o On-prem datacenters
o Cross-cloud failover patterns
• Design DR strategies (RTO/RPO driven) across hybrid environments.
• Eliminate single points of failure across network, compute, storage, and DNS layers.
• Conduct resilience validation, chaos testing, and failure scenario modeling.
2. Multi-Cloud Architecture & Engineering
• Engineer and operate workloads across:
o Amazon Web Services
o Microsoft Azure
• Design cross-cloud networking (VPN, ExpressRoute, Direct Connect, Transit Gateway).
• Implement workload portability and cloud-agnostic deployment strategies.
• Optimize cost, performance, and reliability across providers.
• Design cloud-native autoscaling, load balancing, and traffic routing strategies.
3. Hybrid Infrastructure (On-Prem + Cloud Integration)
• Integrate on-prem infrastructure with cloud platforms using:
o Active Directory / IAM federation
o Hybrid DNS architecture
o Secure certificate lifecycle management
• Troubleshoot hybrid connectivity issues (BGP routing, firewall policies, NAT, MTU mismatches).
• Manage hybrid Kubernetes deployments and private registry integrations.
• Support legacy-to-cloud modernization initiatives.
4. Kubernetes & Container Platform Engineering
• Architect and operate:
o Amazon EKS
o Azure Kubernetes Service
o Self-managed Kubernetes clusters (on-prem)
• Optimize cluster autoscaling, resource allocation, and performance.
• Implement cluster security hardening and RBAC governance.
• Troubleshoot CNI, ingress controllers, service mesh, and pod networking issues.
• Implement GitOps-driven deployments.
5. Observability Engineering Across Distributed Systems
• Build unified observability across hybrid environments using:
o Splunk
o Dynatrace
o Prometheus
o Grafana
o OpenTelemetry
• Implement centralized logging across cloud and on-prem workloads.
• Design distributed tracing across multi-cloud microservices.
• Engineer proactive alerting to reduce MTTR and improve signal quality.
6. CI/CD & Infrastructure Automation
• Engineer resilient CI/CD pipelines (Jenkins, GitHub Actions, Azure DevOps).
• Implement cross-cloud infrastructure as code using:
o Terraform
o CloudFormation
• Automate:
o Certificate rotation
o Auto-scaling policies
o Patch orchestration
o Drift detection
• Improve deployment reliability via blue-green and canary strategies.
7. Advanced Production Troubleshooting
• Lead technical investigation of:
o DNS resolution failures (private/public zones, hybrid forwarding)
o TLS/PKI certificate failures
o Network latency across hybrid circuits
o Memory leaks & kernel-level issues
o Thread contention & CPU throttling
• Perform packet-level debugging (tcpdump, netstat, traceroute).
• Analyze distributed system failures spanning multiple platforms.
Technical Skills Required
• 7-10+ years in SRE / DevOps / Cloud Engineering roles.
• Deep hands-on experience in:
o AWS and Azure
o Hybrid networking
o Kubernetes (cloud & on-prem)
• Strong knowledge of:
o Linux internals
o TCP/IP, DNS, Load Balancing
o TLS/PKI and certificate lifecycle
o Distributed systems architecture
• Strong scripting/programming skills (Python preferred).
• Experience designing cross-cloud DR and failover models.
• Experience with infrastructure as code and GitOps.
Preferred Certifications
• AWS Solutions Architect (Associate/Professional)
• Azure Architect / DevOps Engineer
• Certified Kubernetes Administrator (CKA)
Work Schedule Requirement
This role supports globally distributed, business-critical systems operating 24x7.
The candidate must be willing to participate in rotational on-call shifts, including weekends and off-hours support, as part of a follow-the-sun enterprise support model.
Key Success Metrics
• Improved cross-cloud resiliency and DR posture.
• Reduced hybrid networking incidents.
• Improved SLO compliance across platforms.
• Measurable MTTR reduction.
• Increased automation coverage.
• Reduced change failure rate.
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Every person working for, or on behalf of, Mastercard is therefore responsible for information security and must:
- Abide by Mastercard's security policies and practices;
- Ensure the confidentiality and integrity of the information being accessed;
- Report any suspected information security violation or breach; and
- Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.
Why Work With Us
We live the Mastercard Way: creating value in the communities we touch, growing together through the opportunities we see, and moving fast to innovate and scale. Our collaborative culture and our passionate people are the key to what we do, driving meaningful change as one team and connecting everyone to priceless possibilities.
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
In our ongoing workplace evolution, we’ve introduced hybrid work, Work-From-Elsewhere Weeks and Meeting-Free Days.