Top Remote Site Reliability Engineer Jobs

GitLab

Site Reliability Engineer, Cloud Cost Utilization

Reposted 18 Days Ago

Easy Apply

Remote

Mid level

Cloud • Security • Software • Cybersecurity • Automation

As a Cloud Cost Utilization SRE at GitLab, you'll manage cloud spending, improve tracking and optimization of cloud usage, and collaborate with finance and engineering teams to enhance cost efficiency across AWS and GCP.

Top Skills: Ansible, AWS, ELK, GCP, Grafana, Loki, Mimir, Prometheus, Tempo, Terraform

Optum

Site Reliability Engineer - Remote

Reposted 19 Days Ago

In-Office or Remote

Eden Prairie, MN, USA

73K-130K Annually

Mid level

Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics

The Site Reliability Engineer will design, develop, and support a secure cloud infrastructure while collaborating with development and DevOps teams, ensuring high performance and reliability of systems.

Top Skills: AWS, Azure, Dynatrace, Grafana, Kubernetes, Prometheus, Pulumi, Splunk, Terraform

MongoDB

Site Reliability Engineer (Senior or Staff), Infrastructure Security

Reposted 21 Days Ago

Easy Apply

Remote or Hybrid

5 Locations

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.

Top Skills: Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform

Coinbase

Staff Site Reliability Engineer, Core AI Infrastructure

23 Days Ago

Easy Apply

Remote

USA

218K-257K Annually

Senior level

Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3

Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.

Top Skills: Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform

Mastercard

Senior Site Reliability Engineer

Reposted Yesterday

Remote or Hybrid

Salt Lake City, UT, USA

96K-163K Annually

Senior level

Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing

Lead reliability, scalability, and production operations for a greenfield enterprise application. Influence design for production readiness, own incident response, define SLIs/SLOs, build observability and automation, enhance CI/CD, and improve developer experience across infrastructure and application stacks.

Top Skills: AWS, ChatGPT, Claude, Copilot, Docker, Elasticsearch, GitHub Actions, Go, Grafana, Kubernetes, OpenSearch, Opsgenie, Prometheus, Spring Boot

HiBob

Senior Site Reliability Engineer - Remote EST

Reposted Yesterday

Remote or Hybrid

United States

190K-235K Annually

Senior level

HR Tech • Information Technology • Professional Services • Sales • Software

Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.

Top Skills: AI/LLM, ArgoCD, AWS, CI/CD, Datadog, GitHub Actions, GitOps, Go, Kubernetes, Python

MongoDB

Site Reliability Engineer (Senior or Staff), Atlas

Reposted 25 Days Ago

Easy Apply

Remote or Hybrid

10 Locations

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.

Top Skills: AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS

MongoDB

Senior Site Reliability Engineer, Fleet Management

Reposted 3 Days Ago

Easy Apply

Remote or Hybrid

9 Locations

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.

Top Skills: AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform

Tekmetric

Site Reliability Engineer

Reposted 13 Hours Ago

Remote

United States

Senior level

Automotive

Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.

Top Skills: AWS, Bash, CI/CD, Docker, ELK Stack, GCP, Grafana, Kubernetes, Prometheus, Python, Terraform

StackBlitz

Staff Site Reliability Engineer

Yesterday

Remote

USA

Senior level

Software • Web3

Lead reliability practices across teams: embed early in projects, define SLIs/SLOs, build multi-cloud paved roads with Terraform, run on-call, drive org-wide incident maturity and tooling.

Top Skills: AWS, Azure, GCP, Ruby on Rails, Terraform, TypeScript, WebContainers

Solventum

Site Reliability Engineer

Yesterday

Remote

2 Locations

124K-171K Annually

Senior level

Healthtech • Pharmaceutical • Manufacturing

Support and maintain production Core Speech systems: deploy, monitor, alert, perform capacity planning, respond to on-call incidents, and drive system performance and architecture improvements.

Top Skills: Ansible, AWS CloudFront, AWS DocumentDB, AWS EC2, AWS EFS, AWS EKS, AWS RDS, AWS S3, containerd, Docker, Elasticsearch, Filebeat, Git, GitLab, Go, GoCD, Grafana, Java, Jython, Kibana, Kubernetes, Logstash, MongoDB, Postgres, Python, Redis, Shell, Solr, Terraform

Fabric Health

Site Reliability Engineer

Reposted Yesterday

In-Office or Remote

New York City, NY, USA

135K-160K Annually

Senior level

Artificial Intelligence • Healthtech • Software • Telehealth

Own and evolve Fabric's AWS/EKS infrastructure, build Terraform-managed infrastructure, improve observability with Datadog, lead incident response and SLOs, automate operations with AI/agentic workflows, optimize AWS resources, and ensure HIPAA-compliant, high-availability platform architecture while mentoring engineers.

Top Skills: Agentic Workflows, AI-Assisted Tooling, AWS, Bash, Datadog, EC2, EKS, GitHub Actions, Go, Kubernetes, Python, RDS, Ruby, S3, Semaphore, Terraform

Canonical

Site Reliability Engineer

Reposted Yesterday

In-Office or Remote

7 Locations

200K-200K Annually

Mid level

Cloud • Software

The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing DevSecOps in a fast-paced environment.

Top Skills: Kubernetes, Linux, OpenStack, Python

i4DM

DevSecOps and Site Reliability Engineering (SRE) Technical Director

Reposted Yesterday

Remote

USA

Senior level

Software

Seeking a Technical Director for DevSecOps and SRE to lead platform reliability, CI/CD automation, and compliance for VA healthcare applications.

Top Skills: Agile, Ansible, AWS, CI/CD, DevSecOps, ECS, EKS, Kubernetes, SAFe, Terraform

Akamai Technologies

Site Reliability Engineer II

Reposted Yesterday

In-Office or Remote

2 Locations

95K-171K Annually

Junior

Cloud • Security • Software • Cybersecurity

As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.

Top Skills: Go, Grafana, Kubernetes, Linux, Prometheus, Python, SaltStack, Terraform

Remote

United States

Mid level

Software • Consulting

The Senior Application Support Engineer leads efforts to ensure application reliability, manages incidents, collaborates with teams, and monitors performance, providing 24/7 support.

Top Skills: AppDynamics, AWS, Datadog, Linux, MuleSoft, OpenTelemetry, Python, ServiceNow, Splunk

Learning Technologies Group plc

Site Reliability Engineer (Rustici) US, Franklin, Remote

Reposted Yesterday

In-Office or Remote

Franklin, TN, USA

Mid level

Edtech

The Site Reliability Engineer enhances application deployment in AWS, monitors systems, improves automation, and collaborates with teams on security and performance.

Top Skills: Ansible, AWS, CloudFormation, CSS, Docker, GitHub Actions, Go, HTML, Infrastructure as Code, Java, JavaScript, Jenkins, Kubernetes, Python, Terraform, TypeScript

Core Specialty

Site Reliability Engineer

2 Days Ago

Remote

FL, USA

Senior level

Insurance

Design, build, and maintain highly available cloud-native architectures across Azure and AWS. Implement IaC, observability, SLO/SLI/error budgets, automated remediation, incident response, and resilience patterns. Collaborate with engineering, security, and operations to ensure SLAs, compliance, cost optimization, and disaster recovery.

Top Skills: AKS, ARM, AWS, AWS Lambda, Azure, Azure Application Insights, Azure Container Apps, Azure Functions, Azure Monitor, Bicep, CI/CD, CloudWatch, Containers, Datadog, EKS, GitOps, Microservices, OpenTelemetry, Serverless, Terraform

Andromeda (andromeda.ai)

Site Reliability Engineer - AI Infrastructure

Reposted 2 Days Ago

In-Office or Remote

8 Locations

Senior level

Artificial Intelligence • Cloud • Information Technology • Software

The Site Reliability Engineer will provision and manage Kubernetes clusters, build automation tools, debug customer issues, and improve infrastructure reliability.

Top Skills: Ansible, Bash, Datadog, Go, Grafana, Helm, Kubernetes, Loki, Prometheus, Python, Terraform