Top Remote Site Reliability Engineer Jobs

Reposted 18 Days Ago
Easy Apply
Remote
US
Mid level
Cloud • Security • Software • Cybersecurity • Automation
As a Cloud Cost Utilization SRE at GitLab, you'll manage cloud spending, improve tracking and optimization of cloud usage, and collaborate with finance and engineering teams to enhance cost efficiency across AWS and GCP.
Top Skills: Ansible, AWS, ELK, GCP, Grafana, Loki, Mimir, Prometheus, Tempo, Terraform
Reposted 19 Days Ago
In-Office or Remote
Eden Prairie, MN, USA
73K-130K Annually
Mid level
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
The Site Reliability Engineer will design, develop, and support a secure cloud infrastructure while collaborating with development and DevOps teams, ensuring high performance and reliability of systems.
Top Skills: AWS, Azure, Dynatrace, Grafana, Kubernetes, Prometheus, Pulumi, Splunk, Terraform
Reposted 21 Days Ago
Easy Apply
Remote or Hybrid
5 Locations
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills: Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform
23 Days Ago
Easy Apply
Remote
USA
218K-257K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.
Top Skills: Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform
Reposted Yesterday
Remote or Hybrid
Salt Lake City, UT, USA
96K-163K Annually
Senior level
Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing
Lead reliability, scalability, and production operations for a greenfield enterprise application. Influence design for production readiness, own incident response, define SLIs/SLOs, build observability and automation, enhance CI/CD, and improve developer experience across infrastructure and application stacks.
Top Skills: AWS, ChatGPT, Claude, Copilot, Docker, Elasticsearch, GitHub Actions, Go, Grafana, Kubernetes, OpenSearch, Opsgenie, Prometheus, Spring Boot
Reposted Yesterday
Remote or Hybrid
United States
190K-235K Annually
Senior level
HR Tech • Information Technology • Professional Services • Sales • Software
Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.
Top Skills: AI/LLM, ArgoCD, AWS, CI/CD, Datadog, GitHub Actions, GitOps, Go, Kubernetes, Python
Reposted 25 Days Ago
Easy Apply
Remote or Hybrid
10 Locations
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills: AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Reposted 3 Days Ago
Easy Apply
Remote or Hybrid
9 Locations
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills: AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform
Reposted 13 Hours Ago
Remote
United States
Senior level
Automotive
Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.
Top Skills: AWS, Bash, CI/CD, Docker, ELK Stack, GCP, Grafana, Kubernetes, Prometheus, Python, Terraform
Yesterday
Remote
USA
Senior level
Software • Web3
Lead reliability practices across teams: embed early in projects, define SLIs/SLOs, build multi-cloud paved roads with Terraform, run on-call, drive org-wide incident maturity and tooling.
Top Skills: AWS, Azure, GCP, Ruby on Rails, Terraform, TypeScript, WebContainers
Yesterday
Remote
2 Locations
124K-171K Annually
Senior level
Healthtech • Pharmaceutical • Manufacturing
Support and maintain production Core Speech systems: deploy, monitor, alert, perform capacity planning, respond to on-call incidents, and drive system performance and architecture improvements.
Top Skills: Ansible, AWS CloudFront, AWS DocumentDB, AWS EC2, AWS EFS, AWS EKS, AWS RDS, AWS S3, containerd, Docker, Elasticsearch, Filebeat, Git, GitLab, Go, GoCD, Grafana, Java, Jython, Kibana, Kubernetes, Logstash, MongoDB, Postgres, Python, Redis, Shell, Solr, Terraform
Reposted Yesterday
In-Office or Remote
New York City, NY, USA
135K-160K Annually
Senior level
Artificial Intelligence • Healthtech • Software • Telehealth
Own and evolve Fabric's AWS/EKS infrastructure, build Terraform-managed infrastructure, improve observability with Datadog, lead incident response and SLOs, automate operations with AI/agentic workflows, optimize AWS resources, and ensure HIPAA-compliant, high-availability platform architecture while mentoring engineers.
Top Skills: Agentic Workflows, AI-Assisted Tooling, AWS, Bash, Datadog, EC2, EKS, GitHub Actions, Go, Kubernetes, Python, RDS, Ruby, S3, Semaphore, Terraform
Reposted Yesterday
In-Office or Remote
7 Locations
200K-200K Annually
Mid level
Cloud • Software
The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing DevSecOps in a fast-paced environment.
Top Skills: Kubernetes, Linux, OpenStack, Python
Reposted Yesterday
Remote
USA
Senior level
Software
Seeking a Technical Director for DevSecOps and SRE to lead platform reliability, CI/CD automation, and compliance for VA healthcare applications.
Top Skills: Agile, Ansible, AWS, CI/CD, DevSecOps, ECS, EKS, Kubernetes, SAFe, Terraform
Reposted Yesterday
In-Office or Remote
2 Locations
95K-171K Annually
Junior
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.
Top Skills: Go, Grafana, Kubernetes, Linux, Prometheus, Python, SaltStack, Terraform
Reposted Yesterday
Remote
USA
Mid level
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills: AWS, Kubernetes, Terraform, Terragrunt
Reposted Yesterday
Remote
United States
220K-250K Annually
Expert/Leader
Cloud • Software • Database
Lead the design, build, and operation of the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.
Top Skills: AKS, Ansible, AWS, Azure, Bash, Docker, EKS, GCP, Git, GitHub Actions, GKE, Java, Kubernetes, Linux, Postgres, Prometheus, Python, Shell, Terraform
Reposted Yesterday
Remote
United States
133K-211K Annually
Mid level
Cloud • Security • Software • Generative AI
Design, build, and automate large-scale multi-cloud infrastructure and internal SRE tools. Improve host lifecycle, observability, alerting, and reliability; operate containerized workloads; participate in on-call rotations, incident response, runbooks, postmortems, code reviews, and mentoring.
Top Skills: Ansible, Argo CD, Argo Workflows, CUE, Docker, Elastic Stack, Go, Graphite, Influx, Kubernetes, Linux, Prometheus, Puppet, Terraform, Ubuntu, Ubuntu Live Patch
Reposted Yesterday
In-Office or Remote
2 Locations
165K-215K Annually
Senior level
Software • Cybersecurity
This role involves managing Kubernetes clusters, cloud infrastructure, and CI/CD pipelines. The engineer will enhance system reliability and efficiency while troubleshooting production issues.
Top Skills: Alertmanager, AWS, Azure, Bash, CI/CD, Docker, Elastic Stack, Elasticsearch, GCP, Go, Grafana, Helm, Kafka, Kubernetes, Loki, MongoDB, OCI, Prometheus, Python, Redis, Spark, Terraform
Reposted Yesterday
In-Office or Remote
12 Locations
140K-205K Annually
Senior level
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills: AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform
Reposted Yesterday
Remote or Hybrid
4 Locations
160K-180K Annually
Senior level
Artificial Intelligence • Machine Learning • Software • Analytics
The role involves end-to-end ownership of AWS infrastructure, managing Kubernetes platforms, and ensuring system reliability through observability and automation. Responsibilities include incident response and maintaining CI/CD systems.
Top Skills: ArgoCD, AWS, Datadog, Git, Go, Kubernetes, Python, Terraform
Reposted Yesterday
Remote
United States
Mid level
Software • Consulting
The Senior Application Support Engineer leads efforts to ensure application reliability, manages incidents, collaborates with teams, and monitors performance, providing 24/7 support.
Top Skills: AppDynamics, AWS, Datadog, Linux, MuleSoft, OpenTelemetry, Python, ServiceNow, Splunk
Reposted Yesterday
In-Office or Remote
Franklin, TN, USA
Mid level
Edtech
The Site Reliability Engineer enhances application deployment in AWS, monitors systems, improves automation, and collaborates with teams on security and performance.
Top Skills: Ansible, AWS, CloudFormation, CSS, Docker, GitHub Actions, Go, HTML, Infrastructure as Code, Java, JavaScript, Jenkins, Kubernetes, Python, Terraform, TypeScript
2 Days Ago
Remote
FL, USA
Senior level
Insurance
Design, build, and maintain highly available cloud-native architectures across Azure and AWS. Implement IaC, observability, SLO/SLI/error budgets, automated remediation, incident response, and resilience patterns. Collaborate with engineering, security, and operations to ensure SLAs, compliance, cost optimization, and disaster recovery.
Top Skills: AKS, ARM, AWS, AWS Lambda, Azure, Azure Application Insights, Azure Container Apps, Azure Functions, Azure Monitor, Bicep, CI/CD, CloudWatch, Containers, Datadog, EKS, GitOps, Microservices, OpenTelemetry, Serverless, Terraform
Reposted 2 Days Ago
In-Office or Remote
8 Locations
Senior level
Artificial Intelligence • Cloud • Information Technology • Software
The Site Reliability Engineer will provision and manage Kubernetes clusters, build automation tools, debug customer issues, and improve infrastructure reliability.
Top Skills: Ansible, Bash, Datadog, Go, Grafana, Helm, Kubernetes, Loki, Prometheus, Python, Terraform