Top Site Reliability Engineer Jobs

Xometry

Staff Site Reliability Engineer (SRE)

Reposted 3 Days Ago

In-Office

North Bethesda, MD, USA

135K-165K Annually

Mid level

Artificial Intelligence

In this role, the Site Reliability Engineer will improve reliability and performance of infrastructure, write clean code, collaborate across teams, and maintain platforms for deployed software.

Top Skills: AWS, CI/CD, Docker, JavaScript, Kubernetes, Python, Terraform, Unix Shell

MLabs

SRE (Terminal)

Reposted 3 Days Ago

In-Office

2 Locations

Senior level

Artificial Intelligence • Blockchain • Information Technology • Consulting

Design, scale, and operate multi-region, high-availability cloud infrastructure; lead incident response and on-call rotations; build automation and tooling in Python/Go; enforce risk, security, and operational standards; mentor teams and drive infrastructure architecture decisions.

Top Skills: AWS, Crypto, GCP, Go, IAM, ISO, Kafka, Kubernetes, OpenTofu, Postgres, Python, Redpanda, SOC 2, Terraform, Web3

Cerebras Systems

Site Reliability Engineer - Ops & Automation

Reposted 3 Days Ago

In-Office

2 Locations

Mid level

Artificial Intelligence

The Deployment Engineer will manage AI inference clusters, optimizing deployment, capacity allocation, and ensuring reliability of pipeline operations across datacenters.

Top Skills: Docker, Grafana, InfluxDB, K8s, Linux, Prometheus, Python

HappyRobot

Site Reliability Engineer

Reposted 3 Days Ago

Hybrid

San Francisco, CA, USA

200K-240K Annually

Mid level

Artificial Intelligence • Logistics • Software

The Site Reliability Engineer will enhance operational resilience, ensuring system stability, observability, and debugging workflows for complex failures while improving developer focus and uptime.

Top Skills: Datadog, Go, Prometheus, Python, Sentry

Obsidian Security

Sr. Staff Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Palo Alto, CA, USA

232K-263K Annually

Senior level

Cybersecurity

As a Sr. Staff Site Reliability Engineer, you will define the reliability vision for a multi-tenant SaaS platform, lead the architecture of detection systems, and partner across teams to improve incident management and system resilience, ensuring issues are resolved before affecting customers.

Top Skills: Argo CD, AWS, GCP, GitLab CI/CD, Grafana, Helm, Kubernetes, Prometheus

Genuine Parts Company

Site Reliability Engineer III

Reposted 3 Days Ago

In-Office

Birmingham, AL, USA

Senior level

Automotive • Hardware • Logistics

The Site Reliability Engineer III enhances system reliability by building automation and supporting large-scale systems, ensuring critical platforms function optimally.

Top Skills: APIs, Azure DevOps, Dynatrace, Google Cloud Platform, Grafana, HTTP, Java, Kubernetes, Microservices, Prometheus, Terraform

Canonical

Site Reliability Engineer

Reposted 3 Days Ago

In-Office or Remote

7 Locations

200K-200K Annually

Mid level

Cloud • Software

The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing devsecops in a fast-paced environment.

Top Skills: Kubernetes, Linux, OpenStack, Python

CardWorks

Lead Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Pittsburgh, PA, USA

146K-162K Annually

Senior level

Financial Services

The Lead Site Reliability Engineer will establish the SRE operating model, implement AI-enabled reliability use cases, manage reliability metrics, and oversee operational readiness while collaborating with teams and mentoring engineers.

Top Skills: AI/ML, Ansible, Azure DevOps, Docker, GitHub Actions, GitLab CI, Jenkins, Kubernetes, Terraform, VMware

Sierra

Software Engineer, Site Reliability (SRE)

Reposted 3 Days Ago

In-Office

San Francisco, CA, USA

230K-390K Annually

Senior level

Artificial Intelligence • Software

As a Software Engineer on the Site Reliability team, you'll ensure system reliability, scalability, and observability while partnering with engineering teams and improving incident management processes.

Top Skills: AWS, CI/CD Tooling, Container Orchestration, Datadog, Grafana, Prometheus, Terraform

Intelliswift

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

San Francisco, CA, USA

Mid level

Information Technology • Software • Big Data Analytics

The Site Reliability Engineer will design, analyze, and troubleshoot large-scale distributed systems, focusing on operating systems and performance tuning.

Top Skills: Apache, Java

Premera Blue Cross

Site Reliability Engineer IV

Reposted 3 Days Ago

In-Office

Mountlake Terrace, WA, USA

136K-231K Annually

Senior level

Insurance • Financial Services

Drive reliability and operational excellence across cloud, on-premise, and hybrid platforms. Build automation and AI-powered tooling, design observability and self-healing systems, standardize CI/CD and incident practices, lead post-incident reviews, and support production systems through on-call rotation while advising engineering teams on reliability, compliance, and modern DevOps practices.

Top Skills: AI Platforms, C#, CI/CD, Cloud, Container Platforms, Docker, Event Streaming, Infrastructure-as-Code, Java, JavaScript, Kubernetes, LLMs, Observability, PowerShell, Python, Telemetry

CoorsTek

Software Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Golden, CO, USA

103K-136K Annually

Senior level

Manufacturing

The Site Reliability Engineer will ensure the reliability, security, and support of Databricks applications while collaborating with various teams to optimize data workflows and incident management.

Top Skills: Azure, CI/CD, Databricks, Delta Lake, PySpark, Python, SQL, Unity Catalog

Ministry Brands

Sr. Director, Platform Engineering & SRE

Reposted 3 Days Ago

In-Office

Alpharetta, GA, USA

Senior level

Information Technology • Software

As Sr. Director, Platform Engineering & SRE, you will lead the reliability and operational excellence of the platform, establishing practices for site reliability engineering and managing cloud engineering across a multi-cloud SaaS portfolio.

Top Skills: AWS, Azure, CI/CD, Datadog, GCP, Grafana, Infrastructure-as-Code, OpenTelemetry, Prometheus

Harvey

Staff Software Engineer, Site Reliability Engineer

Reposted 3 Days Ago

In-Office

San Francisco, CA, USA

238K-290K Annually

Expert/Leader

Artificial Intelligence • Legal Tech • Professional Services • Software

As a Staff Software Engineer in Site Reliability, you'll manage infrastructure for reliability and scalability, lead incident management, and automate operational tasks.

Top Skills: AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, incident.io, PagerDuty, Pulumi, Python, Sentry, Terraform

Harvey

Senior Software Engineer, Site Reliability Engineer

Reposted 3 Days Ago

In-Office

San Francisco, CA, USA

200K-260K Annually

Mid level

Artificial Intelligence • Legal Tech • Professional Services • Software

As a Software Engineer in Site Reliability, you will ensure the reliability and performance of our AI platform through automation and strategic infrastructure management.

Top Skills: AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, Kubernetes, PagerDuty, Python, Sentry, Terraform

Akamai Technologies

Site Reliability Engineer II

Reposted 3 Days Ago

In-Office or Remote

2 Locations

95K-171K Annually

Junior

Cloud • Security • Software • Cybersecurity

As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.

Top Skills: Go, Grafana, Kubernetes, Linux, Prometheus, Python, SaltStack, Terraform

PostHog

Site Reliability Engineer (Pacific timezone)

Reposted 3 Days Ago

Remote

USA

Mid level

Software • Analytics

The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.

Top Skills: AWS, Kubernetes, Terraform, Terragrunt

Focused

Staff SRE - Observability

Reposted 3 Days Ago

In-Office

Denver, CO, USA

160K-200K Annually

Mid level

Artificial Intelligence • Cloud • Information Technology • Mobile • Software • Consulting

The role involves designing and implementing OpenTelemetry solutions, optimizing observability infrastructure, and supporting SRE practices and cloud deployments.

Top Skills: AWS, Azure, CloudFormation, Docker, GCP, Go, Java, Kubernetes, Node.js, OpenTelemetry, Pulumi, Python, Rust, Terraform

Cognition

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

San Francisco, CA, USA

Senior level

Artificial Intelligence • Software

The Site Reliability Engineer ensures the reliability and performance of products Devin and Windsurf, managing incident response, CI/CD pipelines, infrastructure as code, and fostering a reliability culture within the engineering team.

Top Skills: AWS, Azure, CI/CD, GCP, Kubernetes, Terraform

Yugabyte

Staff Site Reliability Engineer

Reposted 3 Days Ago

Remote

United States

220K-250K Annually

Expert/Leader

Cloud • Software • Database

Lead design, build, and operate the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.

Top Skills: AKS, Ansible, AWS, Azure, Bash, Docker, EKS, GCP, Git, GitHub Actions, GKE, Java, Kubernetes, Linux, Postgres, Prometheus, Python, Shell, Terraform

Elastic

Site Reliability Engineer (Hosted Infra) - Platform

Reposted 3 Days Ago

Remote

United States

133K-211K Annually

Mid level

Cloud • Security • Software • Generative AI

Design, build, and automate large-scale multi-cloud infrastructure and internal SRE tools. Improve host lifecycle, observability, alerting, and reliability; operate containerized workloads; participate in on-call rotations, incident response, runbooks, postmortems, code reviews, and mentoring.

Top Skills: Ansible, Argo CD, Argo Workflows, CUE, Docker, Elastic Stack, Go, Graphite, Influx, Kubernetes, Linux, Prometheus, Puppet, Terraform, Ubuntu, Ubuntu Live Patch

Stellar Cyber

Senior DevOps Engineer/Site Reliability Engineer-East Coast

Reposted 3 Days Ago

In-Office or Remote

2 Locations

165K-215K Annually

Senior level

Software • Cybersecurity

This role involves managing Kubernetes clusters, cloud infrastructure, and CI/CD pipelines. The engineer will enhance system reliability and efficiency while troubleshooting production issues.

Top Skills: Alertmanager, AWS, Azure, Bash, CI/CD, Docker, Elastic Stack, Elasticsearch, GCP, Go, Grafana, Helm, Kafka, Kubernetes, Loki, MongoDB, OCI, Prometheus, Python, Redis, Spark, Terraform

VERISIGN

SRE - Linux

Reposted 3 Days Ago

In-Office

Reston, VA, USA

136K-184K Annually

Senior level

Information Technology • Software

The Systems Engineer manages Linux systems, designs CI/CD pipelines, administers application security platforms, and ensures compliance with security standards.

Top Skills: Ansible, Bash, CloudBees Jenkins, Docker, ELK, Git, GitHub Actions, GitHub Advanced Security, JFrog Artifactory, JFrog Xray, Jira, Kubernetes, Linux, Nagios, Nexus IQ, Prometheus, Python, Terraform

Okta

Staff Site Reliability Engineer, Kubernetes w/ active TS/SCI

Reposted 3 Days Ago

In-Office

Washington, DC, USA

188K-259K Annually

Senior level

Cloud

The Staff Site Reliability Engineer will lead the design of AWS solutions, manage incident responses, and mentor junior engineers, ensuring reliability and security in federal environments.

Top Skills: AWS, Databricks, Go, Helm, Kubernetes, Redshift, Snowflake, Terraform

xAI

Site Reliability Engineer - Cybersecurity

Reposted 3 Days Ago

In-Office

Palo Alto, CA, USA

180K-360K Annually

Mid level

Information Technology

The role involves securing and maintaining the reliability of X Money's infrastructure, focusing on AWS, Kubernetes, and code security while implementing best practices and collaborative problem-solving.

Top Skills: AWS, DynamoDB, Kubernetes, Python, RDS, Terraform