Top Site Reliability Engineer Jobs

Workday

Software Development Engineer, SRE (US Federal)

Reposted 2 Days Ago

In-Office

Reston, VA, USA

124K-222K Annually

Mid level

Cloud • Fintech • HR Tech

Support U.S. federal government contracts by managing operations of services. Collaborate with development teams to enhance architecture and ensure service reliability.

Top Skills: Cloud Infrastructure, Distributed Systems, IaC Tools, Observability, Programming Languages

Mastercard

Senior Site Reliability Engineer

Reposted 8 Days Ago

Hybrid

O'Fallon, MO, USA

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

The Senior Site Reliability Engineer will enhance service reliability, implement CI/CD using various tools, automate processes, and mentor junior resources.

Top Skills: Artifactory, Bitbucket, C, C++, Chef, Git, Go, Java, Jenkins, Maven, Perl, Python, Ruby

Saviynt

Principal Site Reliability Engineer, Google Cloud

3 Days Ago

Hybrid

2 Locations

240K-250K Annually

Expert/Leader

Software

Define and drive reliability for Saviynt's SaaS platform by designing, building, and operating scalable, reusable platform services. Lead Kubernetes platform engineering, multi-region cloud architectures, event-driven systems, CI/CD pipelines, observability, service mesh, and shared relational data services. Provide tooling, APIs, on-call support, and cross-team guidance.

Top Skills: Argo CD, AWS, Azure, Datadog, ELK (Elasticsearch/Logstash/Kibana), Envoy, GCP, GitLab CI, Go, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, NATS, Postgres, Prometheus, Python, RabbitMQ (RMQ), RESTful APIs, Service Mesh

StackBlitz

Staff Site Reliability Engineer

3 Days Ago

Remote

USA

Senior level

Software • Web3

Lead reliability practices across teams: embed early in projects, define SLIs/SLOs, build multi-cloud paved roads with Terraform, run on-call, drive org-wide incident maturity and tooling.

Top Skills: AWS, Azure, GCP, Ruby on Rails, Terraform, TypeScript, WebContainers

Solventum

Site Reliability Engineer

3 Days Ago

Remote

2 Locations

124K-171K Annually

Senior level

Healthtech • Pharmaceutical • Manufacturing

Support and maintain production Core Speech systems: deploy, monitor, alert, perform capacity planning, respond to on-call incidents, and drive system performance and architecture improvements.

Top Skills: Ansible, AWS CloudFront, AWS DocumentDB, AWS EC2, AWS EFS, AWS EKS, AWS RDS, AWS S3, containerd, Docker, Elasticsearch, Filebeat, Git, GitLab, Go, GoCD, Grafana, Java, Jython, Kibana, Kubernetes, Logstash, MongoDB, Postgres, Python, Redis, Shell, Solr, Terraform

Barclays

SRE Virtual Desktop Operations Engineer - AVP

3 Days Ago

In-Office

New York, NY, USA

120K-175K Annually

Senior level

Fintech • Financial Services

Design, build, and maintain reliable, scalable virtual desktop infrastructure (VDI) and supporting platforms. Lead incident response, automate deployments and operations with IaC and CI/CD, implement secure configurations, monitor system health, collaborate cross-functionally, and drive continuous improvement and operational excellence.

Top Skills: Active Directory, Ansible, ARM/Bicep, Azure DevOps, Citrix Cloud, Citrix Gateway, CVAD, DNS, DSC, GitHub Actions, GitLab CI, GPOs, Jenkins, PowerShell, SSL/TLS Certificates, Terraform, VDI Profile Management, Windows 11 Multi-Session, Windows Server

Morgan Stanley

Site Reliability Engineer - Application Support (Director)

Reposted 3 Days Ago

In-Office

New York, NY, USA

120K-165K Annually

Senior level

Fintech • Financial Services

The SRE Application Support Engineer is responsible for ensuring operational reliability, stability, and optimizing performance of production systems, managing outages, troubleshooting issues, and developing documentation and standards for production applications.

Top Skills: Aurora, AWS, EC2, ECS, Fargate, Grafana, Java, Kibana, Lambda, Postgres, Prometheus, Python, S3, Splunk

SpaceX

Site Reliability Engineer — HPC & Automation (Silicon Engineering)

3 Days Ago

In-Office

Redmond, WA, USA

125K-175K Annually

Junior

Aerospace • Other

Design, operate, scale, and automate HPC clusters and services for silicon design workflows. Manage infrastructure-as-code, CI/CD pipelines, observability, and storage automation. Collaborate with cross-functional teams to eliminate performance bottlenecks and accelerate simulation and regression turnaround times.

Top Skills: Ansible, Ansys, Bamboo, Bash, Cadence, Claude Code, Docker, Grafana, Grok, Jenkins, Keysight, Kubernetes, Linux, LSF, MySQL, NetApp ONTAP, NFS, Postgres, Prometheus, Puppet, Python, REST API, Siemens, Slurm, SQLite, Synopsys, TCP/IP, Terraform

SWIFT

Lead Site Reliability Engineer – Managed Patching

Reposted 3 Days Ago

In-Office

Manassas, VA, USA

122K-226K Annually

Expert/Leader

Fintech • Payments • Software • Financial Services

Lead the design and implementation of an automated patching service, ensuring reliability and compliance while driving continuous improvement and cross-functional collaboration.

Top Skills: Ansible Automation Platform, CI/CD Orchestration, CloudBees, Linux, Power BI, Python, RHEL, ServiceNow

BTIG

Technology, DevOps/Site Reliability Engineer

3 Days Ago

In-Office

San Francisco, CA, USA

85K-115K Annually

Mid level

Financial Services

Provide frontline desktop support for employees (remote and in-person), triage and resolve hardware, Windows, application, phone, and market-data feed issues, manage tickets, perform firmware/patch deployments, and collaborate with IT teams. Support trading desk/C-suite users and maintain endpoint security and configuration management.

Top Skills: Active Directory, Biometric Devices, Bloomberg, Cisco Phone Systems, Cisco Phones, Data Encryption, Endpoint Manager, Fidessa, Firmware Updates, Global Relay, ICE, Microsoft Office/Office 365, MS-900, OneDrive, Patch Management, Printers, REDI+, Scanners, ServiceNow, Softphones, Spyware/Malware Tools, System Center Configuration Manager, Thomson Reuters, Trading Turrets, VPN, Wi-Fi, Windows 10, Windows 11, Zoom

Mastercard

Senior Site Reliability Engineer

Reposted 8 Days Ago

Hybrid

O'Fallon, MO, USA

96K-163K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

The Senior Site Reliability Engineer supports app and service operations, focusing on lifecycle management, system health, and automation. Responsibilities include incident response, CI/CD pipeline support, and mentoring junior resources.

Top Skills: Apache NiFi, C, C++, Dynatrace, Git, Go, Java, Jenkins, Perl, Python, Ruby, Shell Scripting, Splunk, SQL, Unix, XLR

CoreWeave

Senior Site Reliability Engineer, Data Infrastructure

Reposted 8 Days Ago

In-Office

2 Locations

165K-242K Annually

Senior level

Cloud • Information Technology • Machine Learning

As a Senior Site Reliability Engineer, you'll ensure the reliability and performance of a Kubernetes-based data platform, focusing on scaling infrastructure, enhancing security, and optimizing deployment processes.

Top Skills: Airflow, Argo CD, Flink, GitHub Actions, Grafana, Helm, Istio, Kafka, Kubernetes, Linkerd, OpenTelemetry, Prometheus, Pulumi, Spark, Terraform

Fidelity Investments

Principal Site Reliability Engineer

3 Days Ago

In-Office

Durham, NC, USA

Senior level

Fintech

Lead enterprise reliability strategy and architect resilient cloud and on-prem platforms. Manage DR events, observability (Datadog, Splunk), SSL lifecycle, performance testing (Java/JMeter), CI/CD automation, and Kubernetes operations. Advise leadership, mentor engineers, perform complex root-cause analysis, and build performance test frameworks with actionable reporting.

Top Skills: Ansible AWX, Avi, AWS, AWS Route53, Azure, Azure Load Balancer, CI/CD, Cloud-Test, Datadog, Deployment as a Service (DaaS), F5, Grafana, Java, Jenkins Core, JMeter, Kubernetes, Python, Rush-Hour, Shell, Splunk, Terraform, uDeploy

Workday

Software Development Engineer, SRE (US Federal)

3 Days Ago

In-Office

Reston, VA, USA

124K-222K Annually

Mid level

Cloud • Fintech • HR Tech

Operate and support production services as an SRE: drive reliability, performance, capacity planning, observability, automation (IaC), and incident response. Partner with development and infrastructure teams, handle on-call duties, and work on federal contracts requiring US-citizen personnel and clearance eligibility.

Top Skills: Capacity Planning, CASP+, CompTIA CySA+, Distributed Systems, DoD 8570/8140, GICSP, IAT Level II, Incident Management, Infrastructure as Code (IaC) Tools, Observability (Metrics Collection, Logging, Tracing), Public Cloud

WEX Inc.

Site Reliability Engineer 1

3 Days Ago

In-Office

4 Locations

75K-95K Annually

Entry level

Fintech • Payments

Entry-level Site Reliability Engineer supporting system reliability, monitoring, incident triage, and root-cause analysis. Develop basic automation and scripts, follow deployment/change processes, collaborate with senior engineers, and contribute to observability and incident/problem management to improve system resilience and scalability.

Top Skills: Bash, Docker, Kubernetes, Linux, PowerShell, Python, Unix

Bolt Graphics

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Sunnyvale, CA, USA

145K-175K Annually

Senior level

Hardware • Semiconductor • Manufacturing

The Site Reliability Engineer will design, implement, and manage reliable infrastructure and services, ensuring operational excellence and uptime.

Top Skills: AWS, Bash, Docker, Grafana, Kubernetes, Linux, Azure, OpenShift, Prometheus, Proxmox, Python, VMware vSphere

Fabric Health

Site Reliability Engineer

Reposted 3 Days Ago

In-Office or Remote

New York City, NY, USA

135K-160K Annually

Senior level

Artificial Intelligence • Healthtech • Software • Telehealth

Own and evolve Fabric's AWS/EKS infrastructure, build Terraform-managed infrastructure, improve observability with Datadog, lead incident response and SLOs, automate operations with AI/agentic workflows, optimize AWS resources, and ensure HIPAA-compliant, high-availability platform architecture while mentoring engineers.

Top Skills: Agentic Workflows, AI-Assisted Tooling, AWS, Bash, Datadog, EC2, EKS, GitHub Actions, Go, Kubernetes, Python, RDS, Ruby, S3, Semaphore, Terraform

Comtech

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Seattle, WA, USA

Senior level

Other

In this role, you will manage day-to-day operations of Internet-based enterprise systems, identify operational issues, develop tools for maintenance, and collaborate on infrastructure documentation and project execution.

Top Skills: .NET, Ansible, Apache, Azure, Chef, IIS, JBoss, Perl, PowerShell, Puppet, Python, Ruby, Tomcat

Comtech

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Seattle, WA, USA

Senior level

Other

Responsible for monitoring, provisioning, and customer interactions, with a focus on maintaining high availability in complex web environments.

Top Skills: .NET, Ansible, Apache, CFEngine, Chef, Dynatrace, Go, IIS, Java, JBoss, NAS, New Relic, Perl, PowerShell, Puppet, Python, RAID, Ruby, SAN, Splunk, Sumo Logic, Tomcat, Windows

GM Financial

Site Reliability Engineer I

Reposted 3 Days Ago

Hybrid

Arlington, TX, USA

Mid level

Fintech • Financial Services

The Site Reliability Engineer I will support cloud infrastructure and assist in cloud transformation initiatives, focusing on performance and delivery of public cloud solutions, primarily in Azure. Responsibilities include troubleshooting, monitoring, automation, and contributing to operational readiness practices for cloud services.

Top Skills: .NET, Ansible, AWS, Azure, Azure CLI, GCP, Jenkins, Kubernetes, Linux, PowerShell, Terraform, Windows

GM Financial

Site Reliability Engineer II

Reposted 3 Days Ago

Hybrid

2 Locations

Senior level

Fintech • Financial Services

The role involves shaping release engineering practices, implementing AI-driven solutions, and ensuring software reliability through collaboration and automation.

Top Skills: AI-Powered Tools, Azure, Bash, C#, GitHub Copilot, Java, PowerShell

Equifax Inc.

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

2 Locations

60K-90K Annually

Senior level

Fintech • Consulting

The Site Reliability Engineer at Equifax manages system uptime, builds infrastructure as code, develops CI/CD pipelines, automates deployment, solves complex issues, and leads postmortems for system reliability.

Top Skills: Ansible, AWS, Bash, Chef, Docker, GCP, Go, Java, JavaScript, Jenkins, Kubernetes, Node.js, Python, Terraform

Twenty

Forward Deployed Site Reliability Engineer (TS/SCI Required)

Reposted 3 Days Ago

In-Office

Arlington, VA, USA

Senior level

Artificial Intelligence • Information Technology • Cybersecurity • Defense

As a Site Reliability Engineer, you'll ensure system reliability in a government environment, manage incidents, and collaborate with engineering teams on operational tasks and improvements while maintaining security compliance.

Top Skills: AWS, Bash, Docker, Docker Compose, Grafana, Linux/Unix, Loki, Mimir, Prometheus, Python, Terraform

Basata

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Tempe, AZ, USA

Senior level

Artificial Intelligence • Healthtech • Software • Automation

Design and own platform reliability: define SLOs, build observability, lead incident response and postmortems, evolve IaC and deployment pipelines, automate toil, and collaborate with engineers to improve operability and architecture for scaling.

Top Skills: Alerting, Cloud Infrastructure, Containerized Services, Deployment Pipeline, Incident Response, Infrastructure-as-Code, Java, Monitoring, Observability, Python, SLOs, TypeScript

Xometry

Staff Site Reliability Engineer (SRE)

Reposted 3 Days Ago

In-Office

Waltham, MA, USA

135K-165K Annually

Mid level

Artificial Intelligence

The Site Reliability Engineer II will enhance infrastructure and software reliability, write efficient code, collaborate across teams, and maintain platforms and monitoring tools.

Top Skills: AWS, CI/CD, Coralogix, Docker, JavaScript, Kubernetes, Python, Sentry, Terraform, Unix Shell