Top Site Reliability Engineer Jobs

Arctiq

Site Reliability Engineer

Reposted 21 Hours Ago

In-Office or Remote

San Diego, CA, USA

Mid level

Information Technology

The Site Reliability Engineer will implement reliability engineering practices, develop automation, maintain CI/CD pipelines, and ensure system health through monitoring.

Top Skills: Ansible, AWS, Azure, Bash, Docker, ELK Stack, GCP, Go, Grafana, Kubernetes, Prometheus, Python, Terraform

GEICO

Principal Product Manager, Reliability Platform (Observability, SRE, AIM)

Reposted 21 Hours Ago

In-Office

2 Locations

147K-230K Annually

Senior level

Insurance

The Principal Product Manager will lead the development of reliability platforms, focusing on observability, incident management, and system availability while fostering a culture of operational excellence across engineering teams.

Top Skills: AWS, Azure, Cloud Infrastructure, Developer Tools, Grafana, Kubernetes, Observability, Site Reliability Engineering

Quest Diagnostics

Epic Site Reliability Engineer II

Reposted 21 Hours Ago

In-Office

Secaucus, NJ, USA

80K-115K Annually

Mid level

Healthtech • Database

Responsible for reliability engineering, monitoring system performance, automating processes, and collaborating with development teams to enhance operational efficiency.

Top Skills: AWS, Azure, Bash, CI/CD, CloudFormation, Docker, Dynatrace, GCP, Go, JMeter, Kubernetes, NeoLoad, Python, Splunk, Terraform

Old Mission

Site Reliability Engineer

Reposted 21 Hours Ago

In-Office

2 Locations

175K-225K Annually

Mid level

Fintech • Payments • Financial Services

The Site Reliability Engineer will automate processes, manage server deployments, and collaborate with teams to enhance operational workflows in a trading environment.

Top Skills: Ansible, C++, Chef, Cloud Infrastructure, Distributed Systems, Docker, Go, Grafana, HashiCorp Nomad, HPC Clusters, Kubernetes, Linux, Perl, Podman, Prometheus, Puppet, Python, Rancher, Rust, Salt

Offchain Labs

Site Reliability Engineer

Reposted Yesterday

Remote

United States

Mid level

Blockchain • Software

Build, operate, and scale production Kubernetes infrastructure using GitOps and declarative IaC. Design CI/CD workflows, observability, and secure-by-default systems. Troubleshoot networking/storage, participate in on-call rotations, automate operational workflows, and drive postmortems and reliability improvements.

Top Skills: Arbitrum, Argo CD, Argo CD ApplicationSets, AWS, Azure, Bash, CloudWatch, CodeBuild, GCP, GitHub Actions, GitOps, Go, Grafana, K9s, Kubernetes, Linux, Loki, Mimir, Prometheus, Prysm, Python, Terraform, YAML, ZeroDev

Go Rentals

Systems Reliability Engineer (SRE)

Yesterday

In-Office

Newport Beach, CA 92660-6419, USA

150K-160K Annually

Mid level

Transportation • Travel • Hospitality

Ensure reliability, scalability, performance, and availability of production systems by monitoring, incident response, root cause analysis, automation, IaC, container orchestration, observability, and partnering with engineering to improve deployment and operational practices. Participate in on-call rotations and maintain runbooks and operational standards.

Top Skills: AWS, Azure, Bash, CI/CD, CloudFormation, Docker, GCP, Go, Java, Kubernetes, Linux, Pulumi, Python, Terraform, Unix

Accenture

Site Reliability Engineer 6159186

Reposted Yesterday

In-Office

Dallas, TX, USA

50-53 Hourly

Senior level

Information Technology

Provide level-4 SWAT support for APSRE, perform production and lower-lane triage, execute restoral steps, identify root causes, and collaborate with ITIL and partner teams to improve environment stability.

Top Skills: APSRE, ITIL

Bank of America

Site Reliability Engineer Lead

Yesterday

In-Office

Plano, TX, USA

Senior level

Big Data • Fintech • Mobile • Payments • Financial Services • Data Privacy

Lead SRE partnering with development and infrastructure teams to implement monitoring, automation, reliability tooling, alerting, and on-call routines; develop reliability scripts and libraries; triage major incidents; reduce toil and improve observability; decompose work and mentor SRE resources.

Top Skills: Ansible, CI/CD Pipelines, Configuration Management Systems, Go, Identity Systems, Infrastructure as Code (IaC), Monitoring Systems, Networking, Python, Service Mesh Platforms, Terraform, Virtualization

Mastercard

Senior Site Reliability Engineer

Reposted 7 Days Ago

Hybrid

O'Fallon, MO, USA

96K-163K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

The Senior BizOps Engineer is responsible for ensuring platform stability and resilience, guiding teams in product development, and facilitating operational excellence throughout the software lifecycle.

Top Skills: Artifactory, Bitbucket, C, C++, Chef, Dynatrace, Git, Go, Java, Jenkins, Maven, Oracle, Perl, PL/SQL, Postgres, Python, Ruby, Splunk, SQL

Mastercard

Senior Site Reliability Engineer

Reposted 7 Days Ago

Hybrid

O'Fallon, MO, USA

96K-163K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

The Senior BizOps Engineer role involves improving service lifecycles, supporting CI/CD pipelines, and engaging in DevOps automation practices. Responsibilities include system design consulting, operational feedback, incident response, and mentoring junior resources.

Top Skills: Artifactory, Bitbucket, C, C++, Chef, Git, Go, Java, Jenkins, Maven, Perl, Python, Ruby

Mastercard

Senior Site Reliability Engineer

7 Days Ago

Hybrid

O'Fallon, MO, USA

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

Drive reliability, scalability, and performance of Mastercard applications as the production-readiness steward. Implement observability, automation, capacity planning, and monitoring. Support incident triage, root cause analysis and blameless post-mortems. Collaborate with developers to embed operational design, enforce standards, improve CI/CD, container orchestration, and cloud infrastructure, while managing risk, compliance, and continuous improvement.

Top Skills: AWS, Azure, Bash, CI/CD, Containerization, GCP, Go, Linux/Unix, Monitoring/Observability, Orchestration, Python

Mastercard

Senior Site Reliability Engineer

Reposted 7 Days Ago

Hybrid

O'Fallon, MO, USA

96K-163K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

The role involves ensuring application health, automating deployment processes, leading DevOps initiatives, and fostering collaboration between development and operations teams to maintain system resilience and minimize downtime.

Top Skills: Automation, CI/CD, DevOps, Monitoring, Scripting, Software Design

CrowdStrike

Sr. SRE II - Agentic SOC, NG-SIEM (Hybrid)

Reposted 7 Days Ago

Hybrid

Austin, TX, USA

160K-250K Annually

Senior level

Cloud • Computer Vision • Information Technology • Sales • Security • Cybersecurity

In this role, you'll ensure the reliability and scalability of the NG-SIEM platform, manage incident responses, and collaborate across teams to enhance system performance.

Top Skills: Bash, C++, Go, Java, Kafka, Python, Rust

Tabs

Staff Site Reliability Engineer

Reposted Yesterday

In-Office

New York, NY, USA

200K-250K Annually

Expert/Leader

Payments • Software • Automation

Lead platform and infrastructure direction on AWS, evolve CI/CD and ephemeral environments, set observability and SLO standards, drive incident response and postmortems, mentor engineers, and build automation to reduce operational risk.

Top Skills: AWS, CI/CD, Distributed Systems, ECS, Ephemeral Environments/Preview Deploys, Fargate, GitHub Actions, Observability (Metrics, Logs, Tracing), SLOs/SLIs/Error Budgets

Tekmetric

Site Reliability Engineer

Reposted Yesterday

Remote

United States

Senior level

Automotive

Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.

Top Skills: AWS, Bash, CI/CD, Docker, ELK Stack, GCP, Grafana, Kubernetes, Prometheus, Python, Terraform

CME Group

Staff Site Reliability Engineer

Reposted Yesterday

In-Office

Wacker, IL, USA

132K-220K Annually

Senior level

Financial Services

As a Staff Site Reliability Engineer, you will enhance system reliability, architect solutions, drive automation, and implement SRE principles within development processes.

Top Skills: Bash, Chef, CloudFormation, GKE, Go, Java, OpenTelemetry, Prometheus, Python, Rust, Terraform, TypeScript

Momentum Engineering, Inc.

ME00528-Site Reliability Engineer 3

Reposted Yesterday

In-Office

Annapolis Junction, MD, USA

165K-230K Annually

Expert/Leader

Information Technology • Software • Automation

The Senior Site Reliability Engineer will manage AWS environments, develop Infrastructure as Code, and automate operational tasks to ensure high availability in cloud systems.

Top Skills: Amazon Web Services (AWS), Ansible, AWS Certified Developer-Associate, AWS Certified Solutions Architect-Associate, AWS Certified Solutions Architect-Professional, AWS Certified SysOps Administrator-Associate, Certified Kubernetes Administrator (CKAD), CI/CD, Docker, Elastic Certified Engineer, Elastic Certified Observability Engineer, Kubernetes, Terraform

Workday

Software Development Engineer, SRE (US Federal)

Reposted Yesterday

In-Office

Reston, VA, USA

124K-222K Annually

Mid level

Cloud • Fintech • HR Tech

Support U.S. federal government contracts by managing operations of services. Collaborate with development teams to enhance architecture and ensure service reliability.

Top Skills: Cloud Infrastructure, Distributed Systems, IaC Tools, Observability, Programming Languages

Mastercard

Senior Site Reliability Engineer

Reposted 7 Days Ago

Hybrid

O'Fallon, MO, USA

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

The Senior Site Reliability Engineer will enhance service reliability, implement CI/CD using various tools, automate processes, and mentor junior resources.

Top Skills: Artifactory, Bitbucket, C, C++, Chef, Git, Go, Java, Jenkins, Maven, Perl, Python, Ruby

Saviynt

Principal Site Reliability Engineer, Google Cloud

2 Days Ago

Hybrid

2 Locations

240K-250K Annually

Expert/Leader

Software

Define and drive reliability for Saviynt's SaaS platform by designing, building, and operating scalable, reusable platform services. Lead Kubernetes platform engineering, multi-region cloud architectures, event-driven systems, CI/CD pipelines, observability, service mesh, and shared relational data services. Provide tooling, APIs, on-call support, and cross-team guidance.

Top Skills: Argo CD, AWS, Azure, Datadog, ELK (Elasticsearch/Logstash/Kibana), Envoy, GCP, GitLab CI, Go, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, NATS, Postgres, Prometheus, Python, RabbitMQ (RMQ), RESTful APIs, Service Mesh

StackBlitz

Staff Site Reliability Engineer

2 Days Ago

Remote

USA

Senior level

Software • Web3

Lead reliability practices across teams: embed early in projects, define SLIs/SLOs, build multi-cloud paved roads with Terraform, run on-call, drive org-wide incident maturity and tooling.

Top Skills: AWS, Azure, GCP, Ruby on Rails, Terraform, TypeScript, WebContainers

Solventum

Site Reliability Engineer

2 Days Ago

Remote

2 Locations

124K-171K Annually

Senior level

Healthtech • Pharmaceutical • Manufacturing

Support and maintain production Core Speech systems: deploy, monitor, alert, perform capacity planning, respond to on-call incidents, and drive system performance and architecture improvements.

Top Skills: Ansible, AWS CloudFront, AWS DocumentDB, AWS EC2, AWS EFS, AWS EKS, AWS RDS, AWS S3, containerd, Docker, Elasticsearch, Filebeat, Git, GitLab, Go, GoCD, Grafana, Java, Jython, Kibana, Kubernetes, Logstash, MongoDB, Postgres, Python, Redis, Shell, Solr, Terraform

Barclays

SRE Virtual Desktop Operations Engineer - AVP

2 Days Ago

In-Office

New York, NY, USA

120K-175K Annually

Senior level

Fintech • Financial Services

Design, build, and maintain reliable, scalable virtual desktop infrastructure (VDI) and supporting platforms. Lead incident response, automate deployments and operations with IaC and CI/CD, implement secure configurations, monitor system health, collaborate cross-functionally, and drive continuous improvement and operational excellence.

Top Skills: Active Directory, Ansible, ARM/Bicep, Azure DevOps, Citrix Cloud, Citrix Gateway, CVAD, DNS, DSC, GitHub Actions, GitLab CI, GPOs, Jenkins, PowerShell, SSL/TLS Certificates, Terraform, VDI Profile Management, Windows 11 Multi-Session, Windows Server

Morgan Stanley

Site Reliability Engineer - Application Support (Director)

Reposted 2 Days Ago

In-Office

New York, NY, USA

120K-165K Annually

Senior level

Fintech • Financial Services

The SRE Application Support Engineer is responsible for ensuring operational reliability, stability, and optimizing performance of production systems, managing outages, troubleshooting issues, and developing documentation and standards for production applications.

Top Skills: Aurora, AWS, EC2, ECS, Fargate, Grafana, Java, Kibana, Lambda, Postgres, Prometheus, Python, S3, Splunk

SpaceX

Site Reliability Engineer — HPC & Automation (Silicon Engineering)

2 Days Ago

In-Office

Redmond, WA, USA

125K-175K Annually

Junior

Aerospace • Other

Design, operate, scale, and automate HPC clusters and services for silicon design workflows. Manage infrastructure-as-code, CI/CD pipelines, observability, and storage automation. Collaborate with cross-functional teams to eliminate performance bottlenecks and accelerate simulation and regression turnaround times.

Top Skills: Ansible, Ansys, Bamboo, Bash, Cadence, Claude Code, Docker, Grafana, Grok, Jenkins, Keysight, Kubernetes, Linux, LSF, MySQL, NetApp ONTAP, NFS, Postgres, Prometheus, Puppet, Python, REST API, Siemens, Slurm, SQLite, Synopsys, TCP/IP, Terraform