Top Site Reliability Engineer Jobs

Coinbase

Staff Site Reliability Engineer, Core AI Infrastructure

24 Days Ago

Easy Apply

Remote

USA

218K-257K Annually

Senior level

Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3

Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.

Top Skills: Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform

Mastercard

Lead Site Reliability Engineer

Reposted 24 Days Ago

Hybrid

O'Fallon, MO, USA

122K-207K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

Lead Site Reliability Engineer responsible for platform stability, mentoring, and improving application performance through automation, configuration management, and operational readiness.

Top Skills: Go, Java, Python, Spring Framework

Relativity Space

Site Reliability Engineer

25 Days Ago

Easy Apply

In-Office

Long Beach, CA, USA

140K-214K Annually

Senior level

Aerospace • Hardware • Robotics • Software • Manufacturing

Design, implement, and maintain a scalable SRE/DevOps platform across cloud and on-prem sites. Ensure uptime, automate deployments with IaC, define SLOs, leverage configuration management, and partner with development and manufacturing teams to increase automation and reliability.

Top Skills: GCP, Helm, Kubernetes, Terraform

Legora

Senior Site Reliability Engineer

Reposted 2 Days Ago

In-Office

New York City, NY, USA

237K-369K Annually

Senior level

Artificial Intelligence • Legal Tech • Software

As a Senior Site Reliability Engineer, you'll operate foundational platform services, enhance reliability standards, automate processes, and work with engineering teams to improve systems.

Top Skills: Cloud Infrastructure, Kubernetes, Observability Tools

Mastercard

Senior Site Reliability Engineer

Reposted 2 Days Ago

Remote or Hybrid

Salt Lake City, UT, USA

96K-163K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

Lead reliability, scalability, and production operations for a greenfield enterprise application. Influence design for production readiness, own incident response, define SLIs/SLOs, build observability and automation, enhance CI/CD, and improve developer experience across infrastructure and application stacks.

Top Skills: AWS, ChatGPT, Claude, Copilot, Docker, Elasticsearch, GitHub Actions, Go, Grafana, Kubernetes, OpenSearch, Opsgenie, Prometheus, Spring Boot

HiBob

Senior Site Reliability Engineer - Remote EST

Reposted 2 Days Ago

Remote or Hybrid

United States

190K-235K Annually

Senior level

HR Tech • Information Technology • Professional Services • Sales • Software

Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.

Top Skills: AI/LLM, ArgoCD, AWS, CI/CD, Datadog, GitHub Actions, GitOps, Go, Kubernetes, Python

Standard Template Labs

Sr. Site Reliability Engineer

Reposted 4 Days Ago

In-Office

New York, NY, USA

160K-230K Annually

Senior level

Artificial Intelligence • Information Technology • Software

The role involves designing and managing multi-cloud infrastructure, implementing CI/CD pipelines, and ensuring platform reliability, scalability, and security while optimizing performance for a SaaS platform used by enterprise customers.

Top Skills: Argo, AWS, Azure, Datadog, Docker, GCP, GitHub Actions, Go, Kubernetes, Python, Terraform

MongoDB

Senior Site Reliability Engineer, Fleet Management

Reposted 4 Days Ago

Easy Apply

Remote or Hybrid

9 Locations

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.

Top Skills: AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform

January

Senior SRE, Software Engineering

Reposted 4 Days Ago

Hybrid

New York City, NY, USA

205K-225K Annually

Senior level

Artificial Intelligence • Fintech • Payments • Social Impact • Analytics • Financial Services • Automation

As a Senior SRE, you'll ensure reliable and scalable systems, develop observability solutions and infrastructure as code, and lead incident response efforts.

Top Skills: AWS, CloudFormation, Datadog, ELK, Prometheus, Terraform

SilverTech, Inc.

Site Reliability Engineer III

Reposted 58 Minutes Ago

In-Office

Bedford, NH, USA

Senior level

Agency • Marketing Tech • Software • Consulting

Lead and maintain performance, security, and reliability of client hosting environments across multi-cloud platforms. Architect resilient infrastructure, manage IaC and CI/CD, administer Windows/IIS and WP Engine environments, handle SSL/DNS/SSO, participate in on-call rotations, and engage with clients as senior escalation and trusted advisor.

Top Skills: App Services, Application Insights, AWS, Azure, Azure DevOps, Azure SQL, CI/CD, DNS, EC2, IAM, IIS, Key Vaults, PowerShell, RDS, S3, SSL/TLS, SSO, Terraform, Vulnerability Management, Windows Server, WP Engine

TD Bank

Site Reliability Engineer II (US)

2 Hours Ago

In-Office

Charlotte, NC, USA

79K-128K Annually

Senior level

Fintech • Insurance • Financial Services

Provide SRE support for platform-level applications: incident management, performance/availability monitoring, root cause analysis, automation to reduce toil, disaster recovery participation, and technical leadership for reliability improvements.

Top Skills: AWS, Azure, SQL Server, Windows

Comcast

Sr. Site Reliability Engineer, Data - FreeWheel

5 Days Ago

Hybrid

Reston, VA, USA

Senior level

Digital Media • Information Technology • News + Entertainment

Responsible for ensuring reliability, scalability, and performance of data platforms: monitoring, incident response, automation, performance tuning, capacity planning, security/compliance, documentation, and cross-team collaboration to support large-scale data pipelines and backend data systems.

Top Skills: Aerospike, Ansible, AWS, AWS S3, Azure, Cassandra, CI/CD, Containerization, Docker, ELK Stack, GCP, Go, Grafana, Hadoop, HDFS, Java, Kafka, Kubernetes, Microservices, MySQL, NoSQL, Postgres, Prometheus, Python, Scala, Snowflake, Spark, Terraform

Comcast

Sr. Site Reliability Engineer, Data - FreeWheel

5 Days Ago

Hybrid

Chicago, IL, USA

118K-176K Annually

Senior level

Digital Media • Information Technology • News + Entertainment

Responsible for ensuring reliability, scalability, and performance of data platforms. Design monitoring and alerting, automate deployments and recovery, optimize storage and query performance, troubleshoot incidents, plan capacity and scaling, document operations, enforce security/compliance, and collaborate with data engineering, product, and data science teams to maintain high availability of large-scale data systems.

Top Skills: Ansible, AWS, Azure, CI/CD, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, MySQL, NoSQL, Postgres, Prometheus, Python, Scala, Terraform

Obsidian Security

Site Reliability Engineer

14 Hours Ago

In-Office

Palo Alto, CA, USA

165K-190K Annually

Mid level

Cybersecurity

Ensure reliability, scalability, observability, and cost efficiency of a customer-facing SaaS security platform. Manage Kubernetes/Helm deployments, CI/CD (GitLab/ArgoCD), monitoring, and service verification. Embed with engineering teams, optimize developer CI/CD workflows, monitor and debug production on AWS/GCP, and participate in a 24/7 on-call rotation.

Top Skills: ArgoCD, AWS, GCP, GitLab CI/CD, Grafana, Helm, Kubernetes, Microservices, Prometheus

Radiant Digital

Site Reliability Engineering (SRE) Architect

21 Hours Ago

In-Office

Dallas, TX, USA

Expert/Leader

Information Technology

Design and architect highly available OSS/BSS and mainframe systems using SRE principles. Lead reliability, observability, automation, disaster recovery, incident management, and cross-functional transformations across hybrid cloud and on‑prem environments for telecom operations.

Top Skills: AppDynamics, CI/CD, CICS, Db2, DevOps, Dynatrace, Grafana, Hybrid Cloud, IBM Mainframe, IBM Netcool, IBM z/OS, IMS, Infrastructure as Code, Instana, JCL, Linux, Solaris, Splunk, SRE, Telcordia FACS, Telcordia SOAC, Telcordia Switch, Telcordia TIRKS, Telcordia WFA, VSAM

Specter

Site Reliability Engineer

21 Hours Ago

In-Office

San Francisco, CA, USA

Mid level

Artificial Intelligence • Big Data • Information Technology • Software • Analytics

Own reliability for a live fleet of Linux-based edge sensors and cloud infrastructure. Triage and recover field hardware, perform SSH-based diagnostics, build fleet management and OTA systems, implement observability and alerting, automate operational tasks, develop runbooks, and participate in on-call rotations to prevent and resolve incidents.

Top Skills: AWS, Bash, C, DNS, Docker, Firewalls, Go, IAM, Kubernetes, Linux, Python, Rust, SSH, VPN

Vizcom

Senior Platform & Reliability Engineer (SRE)

Reposted 21 Hours Ago

Hybrid

San Francisco, CA, USA

Senior level

Artificial Intelligence • Information Technology • Software

Lead end-to-end platform reliability: define SLIs/SLOs, harden production architecture, ensure Kubernetes runtime and queue safety, run incident command for Sev1/Sev2, own observability/on-call/runbooks, and gate risky releases while delivering a prioritized reliability roadmap.

Top Skills: BullMQ, Koa, Kubernetes, Node.js, PostGraphile, Postgres, React, Redis, TypeScript

Booz Allen Hamilton

Site Reliability Engineer, Senior

Reposted 21 Hours Ago

In-Office

Aurora, CO, USA

87K-198K Annually

Senior level

Information Technology

As a Senior Site Reliability Engineer, you'll enhance system resilience, automate tasks, and improve infrastructure for the Intelligence Community. You'll need significant Linux experience and programming knowledge.

Top Skills: Confluence, Docker, Git, Go, HP, Java, Jenkins, JIRA, Kubernetes, Linux, Nessus, Packer, Python, Rust

Bridge Defense

Lead Site Reliability Engineer

Reposted 21 Hours Ago

In-Office

Washington, DC, USA

Mid level

Aerospace • Defense • Manufacturing

As Lead Site Reliability Engineer, you'll ensure reliability and performance of AI infrastructure, manage deployments, and mentor junior engineers.

Top Skills: Ansible, BMC, CI/CD, CUDA, iDRAC, IPMI, Kubernetes, Linux, NVIDIA GPUs, OpenShift, Terraform

QuEra Computing

Sr. Control System Engineer/Site Reliability Engineer (SRE)

Reposted 21 Hours Ago

In-Office

Boston, MA, USA

Senior level

Hardware • Quantum Computing

Maintain and integrate hardware and software systems for quantum controls, manage lab and test infrastructure (HIL, K8s, networking, rack servers), automate provisioning and CI/CD, implement monitoring/alerting and observability, support incident response and root-cause analysis, and define operational procedures to ensure reliability across development and production environments.

Top Skills: Ansible, Bash, Debian, DHCP, DNS, Docker, ELK Stack, Git, GitLab CI, Go, Grafana, Hardware-in-the-Loop (HIL), Jenkins, Kubernetes, LAN, Prometheus, Python, Rack Mount Servers, Red Hat, Routers, Switches, TCP/IP, Terraform, Ubuntu, VLAN, WAN, Windows

Gamma (gamma.app)

Site Reliability Engineer

Reposted 21 Hours Ago

In-Office

San Francisco, CA, USA

230K-310K Annually

Senior level

Artificial Intelligence • Software

Own the reliability and performance of backend systems at Gamma, building automation and tooling while leading incident response and improving system stability.

Top Skills: AWS, CloudFormation, Docker, Go, Kafka, Kubernetes, Node.js, Python, Terraform, TypeScript

Unify (unifygtm.com)

Staff Site Reliability Engineer, Tech Lead

Reposted 21 Hours Ago

Remote or Hybrid

2 Locations

250K-295K Annually

Senior level

Artificial Intelligence • Software

As Staff SRE Tech Lead, you'll oversee platform reliability and scalability, lead the SRE team, architect data infrastructures, and optimize systems while implementing automation and observability practices.

Top Skills: ClickHouse, Go, Postgres, Python, TypeScript

Aalyria

Site Reliability Engineer

Reposted 21 Hours Ago

Remote

United States

115K-135K Annually

Mid level

Aerospace • Manufacturing

As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.

Top Skills: ArgoCD, AWS, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform

Bandwidth Inc.

Sr. Network Operations VoIP Engineer (Platform & SRE)

Reposted 21 Hours Ago

In-Office

Raleigh, NC, USA

Senior level

Software

The role involves ensuring reliable SIP connectivity, conducting interoperability testing, troubleshooting SIP issues, and automating operations tasks, while mentoring junior staff.

Top Skills: Ansible, Bash, Empirix, Hepic, Linux, Python, RTP, SBCs, SDPs, SIP, SIPp, VoIP, Wireshark