Top Site Reliability Engineer Jobs

Genuine Parts Company

Site Reliability Engineer II

10 Days Ago

In-Office

Birmingham, AL, USA

Mid level

Automotive • Hardware • Logistics

Builds and supports large-scale, distributed, fault-tolerant systems to improve reliability and automation. Administers networks and databases, monitors system health, configures load and data communications, coordinates equipment and vendor orders, and participates in change management to reduce incidents and support cloud transformations.

Top Skills: Cloud, Distributed Computing, Internet Security, Monitoring Tools, Oracle ERP, Unix, Version Control Systems, Windows 2000, Windows 98, Windows NT

Booz Allen Hamilton

Site Reliability Engineer

10 Days Ago

In-Office

McLean, VA, USA

87K-198K Annually

Senior level

Information Technology

Design and build resilient infrastructure, implement monitoring and SLIs/SLOs, automate operations and self-healing, reduce toil with scripting, support enterprise-scale application reliability, act as subject matter expert for engineering teams, and meet government vetting and U.S. citizenship requirements.

Top Skills: AWS, CI/CD, Cloud-Native, CloudTrail, CloudWatch, Git, GitHub Actions, GitLab Runners, ITSI, Jenkins, Linux, Microservices, PaaS, PagerDuty, SaaS, Splunk, Unix

Skyward IT Solutions, LLC

Site Reliability Engineer

10 Days Ago

Hybrid

Rockville, MD, USA

112K-150K Annually

Mid level

Artificial Intelligence • Cloud • Software • Cybersecurity

Operate and tune AWS environments to meet SLAs, build observability and alerts, automate infrastructure with IaC and CI/CD, define SLIs/SLOs, support security/compliance within a FISMA Moderate boundary, design resilience and DR plans, and own incident response and post-mortems.

Top Skills: Ansible, AWS, AWS CloudWatch, AWS Trusted Advisor, CI/CD, CloudFormation, Docker, GitLab CI, Jenkins, New Relic, Python, Splunk, Terraform

Fidelity Investments

Site Reliability Engineer

Reposted 10 Days Ago

In-Office

3 Locations

Junior

Fintech

Build production-quality software to improve reliability, reduce operational toil, and scale systems. Own end-to-end features, participate in on-call rotations, analyze incidents, implement observability, and build automations using Node.js/TypeScript, Python, and AI-assisted tools.

Top Skills: AI-Assisted Development Tools, AWS, Azure, CI/CD, GitHub Copilot, JavaScript, Node.js, PowerShell, Python, SQL, TypeScript, VS Code

Solvd, Inc.

Infrastructure / Site Reliability Engineer (SRE)

Reposted 10 Days Ago

Remote

6 Locations

Mid level

Information Technology • Software • Consulting

Join Solvd as an Infrastructure/SRE Engineer to design and manage cloud infrastructure, build CI/CD pipelines, automate deployments, and ensure system reliability through observability and performance tuning.

Top Skills: Argo CD, AWS, Azure, Bash, Datadog, Docker, Flux, GCP, GitHub Actions, GitLab CI, Go, Grafana, Jenkins, Kubernetes, Memcached, New Relic, OpenTofu, Postgres, Prometheus, Python, RDS, Redis, Terraform

FloSports

Staff Site Reliability Engineer

Reposted 10 Days Ago

Remote

United States

Senior level

Digital Media • Social Media • Software • Sports

Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.

Top Skills: AWS EKS, Datadog, GitHub Actions, Go, Istio, k6, Kubernetes, Node.js, Terraform

eMed Digital Healthcare

Senior Software Engineer (SRE)

Reposted 10 Days Ago

In-Office

Miami, FL, USA

Senior level

Healthtech

The Senior Software Engineer will enhance system reliability, manage Kubernetes and AWS environments, oversee incident responses, and implement observability measures.

Top Skills: AWS, CloudWatch, ELB, GitHub Actions, Kubernetes, Observability Tooling, Terraform, VPC

LSEG (London Stock Exchange Group)

Senior Engineer, Site Reliability Engineer

Reposted 10 Days Ago

In-Office

St. Louis, MO, USA

100K-120K Annually

Senior level

Fintech • Analytics

As a Senior Site Reliability Engineer, you'll lead incident recovery, enhance production stability, automate processes, and collaborate with development teams to improve operational efficiency.

Top Skills: AWS, Azure, BigPanda, Cloud-Native Applications, Datadog, DNS, Docker, Git, HTTP, Kubernetes, Shell Scripting, TCP/IP, Unix

LSEG (London Stock Exchange Group)

Senior Engineer, SRE

Reposted 10 Days Ago

In-Office

St. Louis, MO, USA

Senior level

Fintech • Analytics

The Site Reliability Engineer will support and automate critical Real Time applications, ensuring service availability and quality across cloud and on-premises deployments, and will collaborate with other teams on operational documentation and incident management.

Top Skills: AWS, Azure, Datadog, Docker, Git, Kubernetes, Python, Unix/Linux

SingleStore

Site Reliability Engineer

Reposted 10 Days Ago

In-Office

Seattle, WA, USA

Senior level

Cloud • Software • Database

The Site Reliability Engineer will optimize and scale managed services across cloud providers, automate infrastructure, enhance monitoring, and ensure system reliability.

Top Skills: AWS, Azure, Bash, GCP, Grafana, Kubernetes, Loki, Mimir, Prometheus, Python

Ironclad

Senior Staff Site Reliability Engineer

Reposted 10 Days Ago

Hybrid

3 Locations

245K-270K Annually

Senior level

Information Technology • Consulting

As a Senior Staff Site Reliability Engineer, you will lead the SRE team, advocate best practices, ensure resilience in cloud architecture, and mentor team members.

Top Skills: Argo CD, CircleCI, Google Cloud Platform, Kubernetes, Pulumi, Terraform, TypeScript

Ditto

Site Reliability Engineer

Reposted 10 Days Ago

Remote

USA

156K-288K Annually

Mid level

Computer Vision • Machine Learning • Software

As a Site Reliability Engineer, ensure the reliability, performance, and scalability of Ditto's cloud infrastructure by developing observability solutions, leading incident management, and collaborating with product engineering teams.

Top Skills: AWS, Azure, C, Datadog, GCP, Go, Grafana, Helm, Java, Kubernetes, Prometheus, Rust, Terraform

Mercor

Site Reliability Engineer

Reposted 10 Days Ago

In-Office

San Francisco, CA, USA

Senior level

Artificial Intelligence • Software

As a Site Reliability Engineer at Mercor, you will ensure production reliability, develop the SRE function, and collaborate with engineering teams to maintain system performance.

Top Skills: AWS, Kubernetes, Spacelift, Terraform

Speakeasy

Platform Engineer (SRE) - AI Control Plane

Reposted 10 Days Ago

In-Office

San Francisco, CA, USA

Mid level

Enterprise Web • Information Technology • Software

As a Platform Engineer, you will enhance reliability and performance, design operational processes, and build monitoring systems while collaborating with a talented team.

Top Skills: AI, Assistants, Backend, Developer Tools, Frontend, Infrastructure, MCPs, Monitoring Systems, Skills

CME Group

Staff Site Reliability Engineer

Reposted 10 Days Ago

In-Office

Wacker, IL, USA

132K-220K Annually

Expert/Leader

Financial Services

The Staff Site Reliability Engineer will lead Platform Engineering's SRE efforts by defining technical strategy, overseeing architecture, and enhancing operational excellence through mentorship and governance.

Top Skills: Argo CD, GCP, GKE, Go, Kafka, Node.js, Python, Terraform

Xona Space Systems

Site Reliability Engineer (SRE)

Reposted 10 Days Ago

In-Office

Burlingame, CA, USA

170K-197K Annually

Mid level

Aerospace • Artificial Intelligence

The Site Reliability Engineer will architect and manage ground infrastructure for satellite systems, ensuring high availability, automating deployments, and optimizing data management systems.

Top Skills: Ansible, AWS, Azure, C++, CloudFormation, EKS, ELK, GCP, Grafana, Helm, Kubernetes, Prometheus, Python, Terraform

Speakeasy

Platform Engineer (SRE) - AI Control Plane

Reposted 10 Days Ago

In-Office

San Francisco, CA, USA

Mid level

Software

Join a passionate team to enhance reliability and performance of the AI control plane, manage deployments, and respond to production incidents while ensuring service quality for customers.

Top Skills: AI Control Plane, Developer Tools, Infrastructure

Patterson-Uti Energy

Site Reliability Engineer NEX

Reposted 10 Days Ago

In-Office

Houston, TX, USA

Mid level

Other • Energy

The Site Reliability Engineer will build and maintain reliable systems on Google Cloud Platform, automate operations, and improve system performance and reliability.

Top Skills: Airflow, BigQuery, Cloud Monitoring, Dataflow, Datastream, Docker, GitHub Actions, GitLab CI, Go, Google Cloud Platform, Grafana, IAM, Java, Kubernetes, Prometheus, Python, Terraform

Analytic Partners

Site Reliability Engineer

Reposted 10 Days Ago

Hybrid

3 Locations

100K-115K Annually

Mid level

AdTech • Big Data • Marketing Tech • Software

Responsible for owning and optimizing the Internal Developer Platform, improving reliability, scalability, and usability while supporting engineering teams and standardizing operational processes through automation and best practices.

Top Skills: ARM, AWS, Azure, Bash, CloudFormation, Consul, Docker, GitHub Actions, HashiCorp, Jenkins, Kubernetes, Linux, Nomad, PowerShell, Python, Splunk, Sumo Logic, Terraform, Vault, Windows

Rainforest

Site Reliability Engineer

Reposted 10 Days Ago

Hybrid

Atlanta, GA, USA

Mid level

Fintech • Payments • Financial Services

Build, operate, and scale AWS-based infrastructure using IaC (Terraform), manage EKS and serverless environments, create CI/CD pipelines, implement observability (OpenTelemetry/Prometheus/New Relic), support Postgres/RDS (Aurora), lead incident response, and define SRE practices (SLIs/SLOs/error budgets).

Top Skills: Aurora, AWS, AWS RDS, Azure, CloudFormation, ECS, EKS, GitHub Actions, GitLab, Go, GCP, Java, Kubernetes, New Relic, OpenTelemetry, OpenTofu, Postgres, Prometheus, Python, Ruby, Serverless, Terraform, Terragrunt

AlphaSense

Staff Site Reliability Engineer

11 Days Ago

Remote or Hybrid

United States

150K-225K Annually

Senior level

Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence

Lead architecture and implementation of reliability platforms and SRE practices for a production SaaS. Build self-service reliability tooling, drive AIOps automation, advance observability (monitoring, tracing, profiling), lead incident response and postmortems, mentor engineers, and embed production readiness across teams to achieve 99.99% uptime.

Top Skills: AWS, Azure, Continuous Profiling, Datadog, DNS, ELK, GCP, Go, Grafana, HTTP/S, Kubernetes, Load Balancing, OpenTelemetry, Prometheus, Python, TCP/IP

Accela

Site Reliability Engineer 2

11 Days Ago

In-Office or Remote

Basel, KS, USA

125K-145K Annually

Mid level

Software

Operate and improve Accela's cloud-based SaaS platform to ensure availability, performance, security, and scalability. Build automation and tooling, monitor observability and SLOs, participate in incident response and RCA, support deployments and change management, and help maintain compliance for regulated environments.

Top Skills: Ansible, Argo CD, Bash, Claude Code, Flux, Git, GitHub Copilot, Kubernetes, Linux, Azure, OpenTelemetry, PowerShell, Python, Terraform

Procter & Gamble

Site Reliability Engineer - Band 1 - Digital Commerce

11 Days Ago

In-Office or Remote

3 Locations

176K-176K Annually

Entry level

AdTech • Beauty • Marketing Tech • Retail • Pharmaceutical

Lead incident response and root cause analysis, maintain platform reliability and performance, implement and improve observability solutions, collaborate with vendor teams, and contribute to continuous improvement of incident management and operational processes.

Top Skills: Databricks, Grafana, Prometheus, Spyglass

Relativity

Lead SRE Engineer

11 Days Ago

Remote

Illinois, USA

150K-224K Annually

Senior level

Legal Tech • Software

Lead Site Reliability Engineer responsible for platform availability and reliability of RelativityOne. Drive SRE best practices, build tools, lead projects, coach SREs, work with stakeholders, support incidents, run postmortems, and improve monitoring, automation, and operational efficiency.

Top Skills: CI/CD, DevOps, Jenkins, JIRA, Kubernetes, Azure, Monitoring and Alerting, New Relic, NoSQL, PowerShell, Relativity Server, RelativityOne, SQL, Tableau

Vertafore

Sr. Site Reliability Engineer

Reposted 16 Days Ago

Hybrid

Denver, CO, USA

110K-145K Annually

Senior level

Information Technology • Insurance • Software

Responsible for the reliability and performance of production services, managing SLIs and SLOs, and leading incident responses while collaborating with various teams.

Top Skills: .NET, AWS, C#, CI/CD, Java, Kubernetes, Linux, Python, React, Windows