Top Site Reliability Engineer Jobs

Castleton Commodities International

Senior Site Reliability Engineer

Reposted 15 Days Ago

In-Office

2 Locations

Senior level

Energy

The Senior Site Reliability Engineer improves infrastructure reliability and scalability, partners with various teams, implements IaC and CI/CD, and ensures business continuity through effective BCP/DR planning.

Top Skills: AWS, Bash, CloudFormation, Datadog, ELK, GitHub Actions, GitLab CI, Go, Grafana, Jenkins, Kubernetes, OpenSearch, Prometheus, Python, Terraform

Anyscale

Senior Site Reliability Engineer, Platform Infrastructure (Foundations)

16 Days Ago

Hybrid

2 Locations

Senior level

Artificial Intelligence • Software

Design, build, and scale control- and data-plane infrastructure for distributed AI workloads. Improve reliability, performance, scheduling, and observability for Ray clusters across cloud and on-prem environments. Support accelerator integration, container image management, and provide on-call troubleshooting and cross-team collaboration.

Top Skills: AWS, Azure, Containers, GCP, Go, GPUs, Grafana, Kubernetes, Linux, Prometheus, Python, Ray, TPUs, VMs

Kody

Senior Site Reliability Engineer- Sunnyvale, CA, the US

16 Days Ago

In-Office

Sunnyvale, CA, USA

Senior level

Fintech • Payments • Software • Financial Services

Senior SRE responsible for ensuring platform scalability and reliability on AWS, owning CI/CD and GitHub workflows, leading incident response and post-mortems, implementing observability and logging, and serving as a bilingual (Mandarin/English) technical liaison with international engineering teams.

Top Skills: AWS, CI/CD, Git, Logging, Monitoring, Scripting

Renaissance Learning

Sr Site Reliability Engineer

17 Days Ago

Remote

110K-151K Annually

Senior level

Edtech

Lead SRE work to improve availability, reliability, observability, and security for a distributed SaaS platform. Build and maintain IaC (Terraform, CloudFormation), support CI/CD, manage containerized production environments (Kubernetes/EKS), run disaster recovery exercises, participate in on-call rotation, collaborate cross-functionally, and mentor teams while integrating tooling including AI into SRE workflows.

Top Skills: .NET, Ansible, AWS EKS, CI/CD, CloudFormation, Docker, Java, JavaScript, Kubernetes, Python, Terraform

Blitzy

Senior Site Reliability Engineer

17 Days Ago

In-Office

Cambridge, MA, USA

160K-180K Annually

Senior level

Artificial Intelligence • Software • Generative AI • Automation

Lead design, build, and operation of scalable, fault-tolerant cloud infrastructure. Define SLOs/SLAs, improve observability and incident response, own CI/CD and deployment automation, partner with engineering teams on reliability, capacity planning, performance benchmarking, cost optimization, and security for an AI platform.

Top Skills: AWS, Azure, Bash, CI/CD, Datadog, eBPF, GCP, Go, GPU, Grafana, Istio, Kubernetes, Linkerd, OpenTelemetry, Prometheus, Pulumi, Python, Terraform

Metabase

Senior SRE/DevOps Engineer

Reposted 17 Days Ago

Remote

United States

Senior level

Big Data

You will manage AWS infrastructure, automate deployments, debug application issues, and improve the operational health of Metabase Cloud.

Top Skills: AWS, Datadog, Go, Grafana, Kubernetes, Prometheus, Python, Terraform

Life.Church

Senior Site Reliability Engineer

Reposted 18 Days Ago

In-Office

Edmond, OK, USA

Senior level

Other

The Senior Site Reliability Engineer ensures the integrity, performance, and reliability of cloud infrastructure, overseeing software development, maintenance, and site reliability issues while promoting industry best practices.

Top Skills: Cloud Infrastructure, DevOps, Software Development

Heidi Health

Senior Site Reliability Engineer (Upmarket)

Reposted 19 Days Ago

In-Office

San Francisco, CA, USA

140K-185K Annually

Mid level

Artificial Intelligence • Healthtech

The role involves improving operational reliability, managing production environments, enhancing observability, automating tasks, and collaborating with engineering teams, requiring 3-6 years of relevant experience.

Top Skills: AWS, Bash, Datadog, Kubernetes, Prometheus, Python, Terraform

LiveRamp

Senior SRE

Reposted 19 Days Ago

In-Office

San Francisco, CA, USA

127K-192K Annually

Senior level

Big Data • Cloud • Marketing Tech • Social Impact • Software

As a Senior Site Reliability Engineer, you will support product deployments, provide engineering support, maintain systems, and collaborate with teams globally to enhance infrastructure reliability.

Top Skills: AWS, Cassandra, CircleCI, DynamoDB, GCP, Go, Jenkins, Kubernetes, NoSQL Databases, Python, ScyllaDB, SingleStore DB, Terraform

Las Vegas Sands

Senior Site Reliability Engineer

Reposted 19 Days Ago

In-Office

Dallas, TX, USA

Senior level

Healthtech • Travel

The Senior Site Reliability Engineer leads reliability engineering for Azure, focusing on scripting, automation, observability, and incident response, ensuring service quality and uptime.

Top Skills: AKS, App Services, Application Insights, Azure, Azure DevOps, Azure Monitor, Bicep, Functions, GitHub Actions, Grafana, ITRS Geneos, Jira, Log Analytics, PowerShell, Python, ServiceNow, Terraform, VM Scale Sets

Luma AI

Senior Site Reliability Engineer

Reposted 20 Days Ago

In-Office or Remote

9 Locations

170K-290K Annually

Expert/Leader

Artificial Intelligence • Software

As a Software Engineer in Reliability, you'll architect and manage multi-cloud GPU infrastructure, ensuring performance, security, and scale while debugging complex hardware/software issues.

Top Skills: AMD, AWS, Bash, Go, GPU, InfiniBand, Linux, NVIDIA, OCI, Python, RDMA

Filevine

Senior Site Reliability Engineer - GCP

Reposted 20 Days Ago

Remote

United States

Expert/Leader

Legal Tech • Software

As a Site Reliability Engineer, you'll develop autonomous systems, improve CI/CD pipelines, mentor junior engineers, and ensure software reliability and security in a 24/7 environment.

Top Skills: Bash, PowerShell, Python

MetroStar

Sr. Site Reliability Engineer III (6572)

Reposted 21 Days Ago

In-Office

Washington, DC, USA

170K-220K Annually

Senior level

Information Technology • Consulting

As a Sr. Site Reliability Engineer, you'll design, deploy, and maintain applications in virtualized environments, develop CI/CD pipelines, and ensure operational observability and performance of production systems.

Top Skills: Ansible, Bash, F5, GitLab CI/CD, Kubernetes, MinIO, Portworx, S3-Compatible Services, VMware

CodeRabbit

Senior Site Reliability Engineer

Reposted 21 Days Ago

Hybrid

San Francisco, CA, USA

250K-350K Annually

Senior level

Artificial Intelligence • Information Technology • Software

The Site Reliability Engineer will ensure high availability and performance of CodeRabbit's AI-powered code review platform, enhancing system reliability through infrastructure ownership, performance engineering, and automation.

Top Skills: AWS, Datadog, Docker, ELK Stack, Google Cloud Platform, Grafana, Kubernetes, Linux, Node.js, Prometheus, Terraform, TypeScript

Okta

Senior Site Reliability Engineer, Kubernetes w/ active TS/SCI

Reposted 21 Days Ago

In-Office

Washington, DC, USA

147K-202K Annually

Senior level

Cloud

The Staff Site Reliability Engineer will manage large-scale cloud production systems, ensuring reliability and performance, while automating processes and responding to incidents.

Top Skills: AWS, Bash, CloudFormation, Docker, Go, Helm, Kubernetes, Python, Ruby, Terraform

Esri

Sr. Site Reliability Engineer - AWS Geospatial Technology

Reposted 21 Days Ago

In-Office

Vienna, VA, USA

84K-142K Annually

Senior level

Other • Software • Analytics

The Sr. Site Reliability Engineer will manage SaaS capabilities, implement monitoring systems, automate operational tasks, and provide on-call support.

Top Skills: AWS, Bash, Docker, EKS, ELK, Git, Java, Kubernetes, Prometheus, Python, Terraform

Esri

Sr. Site Reliability Engineer - AWS Geospatial Technology

Reposted 21 Days Ago

In-Office

Charlotte, NC, USA

84K-142K Annually

Senior level

Other • Software • Analytics

The role involves deploying and managing SaaS solutions, automating infrastructure processes, troubleshooting system issues, and collaborating with a team of engineers.

Top Skills: ArcGIS Velocity, ArcGIS Workflow Manager, AWS, AWS Lambda, Bash, Docker, EKS, ELK, Git, Kafka, Kubernetes, OpenSearch, Prometheus, Python, Terraform

Esri

Sr. Site Reliability Engineer - AWS Geospatial Technology

Reposted 21 Days Ago

In-Office

St. Louis, MO, USA

84K-142K Annually

Senior level

Other • Software • Analytics

As a Sr. Site Reliability Engineer, you will manage cloud-based SaaS products, automate infrastructure, troubleshoot issues, and provide technical support while collaborating with a team of engineers.

Top Skills: AWS, AWS Lambda, Bash, Docker, ECS, EKS, ELK, Git, Java, Kafka, Kubernetes, OpenSearch, Prometheus, Python, Security Groups, Terraform, VPC

Esri

Sr. Site Reliability Engineer - AWS Geospatial Technology

Reposted 21 Days Ago

In-Office

Redlands, CA, USA

84K-142K Annually

Senior level

Other • Software • Analytics

The role involves deploying and managing SaaS capabilities on AWS, including monitoring systems, automation solutions, and troubleshooting incidents. Collaboration with SRE engineers is key to operational success across multiple regions.

Top Skills: AWS, AWS Lambda, Bash, Docker, ECS, ELK, Git, Kafka, Kubernetes, OpenSearch, Prometheus, Python, Terraform

DoubleVerify

Sr. Site Reliability Engineer I

Reposted 21 Days Ago

In-Office

New York, NY, USA

89K-178K Annually

Senior level

AdTech • Marketing Tech

The role involves enhancing the reliability and performance of media measurement platforms, managing incidents, implementing observability practices, automating processes, and ensuring high availability of cloud and on-premises infrastructures.

Top Skills: Ansible, AWS, Bash, GCP, GitLab, Go, Grafana, Helm, Kubernetes, Linux, MongoDB, Nagios, NoSQL, OCI, Prometheus, Python, Snowflake, Splunk, SQL, Terraform, Unix, Vertica

iHeartMedia

Senior Site Reliability Engineer

23 Days Ago

In-Office

Oakland Estates, San Antonio, TX, USA

Senior level

Digital Media • Events • Music

Lead and manage a team of SRE/DevOps engineers to ensure reliability, availability, and performance of cloud-based systems. Oversee incident response, operational troubleshooting, process improvements, and cross-team collaboration while mentoring and delegating tasks to meet business objectives.

Top Skills: Cloud Services

Arx Talent

Senior Site Reliability Engineer

23 Days Ago

Hybrid

San Francisco, CA, USA

165K-235K Annually

Senior level

Artificial Intelligence • HR Tech • Professional Services

Design, build, and operate scalable, reliable cloud infrastructure. Maintain AWS/GCP and Linux systems, Kubernetes clusters, CI/CD pipelines, and monitoring (Prometheus/ELK). Automate operations, troubleshoot production issues, run on-call, conduct reviews, and evaluate new technologies to improve availability and performance.

Top Skills: Ansible, AWS, CI/CD, ELK, GCP, Jenkins, Kubernetes, Linux, Prometheus, Puppet, Terraform

Arx Talent

Senior Site Reliability Engineer (Copy)

23 Days Ago

Hybrid

San Francisco, CA, USA

205K-225K Annually

Senior level

Artificial Intelligence • HR Tech • Professional Services

Design, build, and operate reliable, scalable cloud infrastructure. Maintain AWS/GCP and Linux systems, manage Kubernetes clusters, implement IaC (Ansible/Puppet/Terraform), automate CI/CD (Jenkins), monitor with Prometheus/ELK, triage alerts, participate in design/reviews, migrate apps to Kubernetes, and improve operational automation.

Top Skills: Ansible, AWS, C++, ELK, GCP, Go, Jenkins, Kubernetes, Linux, Prometheus, Puppet, Rust, Terraform, TypeScript

IDEXX

Senior Site Reliability Engineer

23 Days Ago

In-Office or Remote

3 Locations

100K-125K Annually

Senior level

Healthtech • Pet • Biotech

Senior SRE responsible for designing and modernizing CI/CD and deployment systems, automating AWS Serverless infrastructure, improving observability and incident response, enforcing release and security practices, and guiding engineering teams to scale resilient global services.

Top Skills: AuroraDB, AWS CloudFormation, AWS Lambda, Azure Entra ID, CloudFront, DynamoDB, EventBridge, Git, GitHub Actions, Maven, OAuth2, OpenID Connect, S3, SNS, SQS, Terraform

Illumio

Sr. Site Reliability Engineer

23 Days Ago

In-Office

Sunnyvale, CA, USA

170K-196K Annually

Senior level

Software • Cybersecurity

Drive reliability, scalability, and performance of cloud-based systems on AWS/Azure. Monitor systems, handle on-call production support, lead incident response and root cause analysis, perform releases and hotfixes, implement cloud security controls, and automate infrastructure improvements.

Top Skills: AWS, Azure, Azure DevOps, Cloud-Native, Docker, GitLab CI/CD, Go, Jenkins, Kubernetes, Microservices, PowerShell, Python