Top Remote Site Reliability Engineer Jobs

Reposted Yesterday
In-Office or Remote
Eden Prairie, MN, USA
73K-130K Annually
Senior level
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
The Software Engineer (SRE) role focuses on improving application stability in cloud environments, facilitating modern application deployments, and ensuring system resilience through incident response and monitoring tools.
Top Skills: AWS, Azure, CI/CD, Dynatrace, GCP, Grafana, Python, React, Splunk, Terraform
Reposted Yesterday
Easy Apply
Remote or Hybrid
US
200K-230K Annually
Senior level
Artificial Intelligence • Machine Learning
Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.
Top Skills: Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks
Reposted Yesterday
Remote
United States
223K-302K Annually
Expert/Leader
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.
Top Skills: AI Technologies, Debugging, Distributed Systems, Incident Response, Observability, Reliability Risk Management, SLAs, SLOs
2 Days Ago
Remote or Hybrid
United States
200K-250K Annually
Senior level
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead long-term strategy and architecture for cloud and on‑prem platform infrastructure, driving Kubernetes and multi‑cloud reliability, IaC/GitOps automation, observability, SLO/SLI/error‑budget practices, incident leadership, AI‑augmented tooling adoption, and mentorship of senior engineers to improve platform resilience and developer experience.
Top Skills: Amazon Elastic Kubernetes Service (EKS), Autoscaling, AWS, Capacity Planning, CI/CD, GitOps, Go, Google Cloud Platform, Google Kubernetes Engine (GKE), Identity and Access Management, Infrastructure as Code, Kubernetes, Linux, Networking, Observability, Operators, Pulumi, Python, RKE2, Storage, Terraform
Reposted 3 Days Ago
Remote or Hybrid
4 Locations
175K-175K Annually
Senior level
eCommerce • Legal Tech • Professional Services • Software • Data Privacy
The Site Reliability Engineer will ensure systems run smoothly, work with automation tools, resolve issues, and drive operational improvements.
Top Skills: AWS, Azure, CloudFormation, Docker, GCP, Grafana, Kubernetes, Memcached, New Relic, OpenTelemetry, Postgres, Prometheus, Pulumi, Redis, Sentry, Terraform
Reposted 4 Days Ago
Easy Apply
Remote or Hybrid
Crystal City, VA, USA
140K-200K Annually
Senior level
Cloud • Information Technology • Security • Software • Cybersecurity
Responsible for managing operations within classified environments, overseeing cloud infrastructure, automating tasks, and ensuring system stability in a high-security setting.
Top Skills: Ansible, AWS ECS, Kubernetes, Linux, Python, Terraform
Reposted 4 Days Ago
Remote or Hybrid
New York, NY, USA
130K-170K Annually
Senior level
AdTech • Cloud • Digital Media • Information Technology • News + Entertainment • App development
Oversee operational support of SAP BTP CPI applications, manage incidents, lead support specialists, and collaborate on architecture and governance for finance processes.
Top Skills: ABAP Proxies, AEM, CAPM, Cloud Connector, Cloud Foundry, Edge Integration Cell, IDoc, JSON, Message Queues, OAuth, OData, REST, SAML, SAP BTP, SFAPI, SFTP, SOAP, XML
5 Days Ago
Remote or Hybrid
Centennial, CO, USA
110K-145K Annually
Mid level
AdTech • Cloud • Digital Media • Information Technology • News + Entertainment • App development
Build and maintain automation and reliability for live video distribution across on-prem and cloud. Deploy and manage systems, develop monitoring and automated recovery, troubleshoot complex incidents, coordinate with vendors, document SOPs, support live broadcast components, and participate in L2 on-call rotation.
Top Skills: AAC, AC3, Ansible, ATSC, AVC, AWS, Bash, Chef, CloudFormation, CMAF, Docker, EKS, Git, HEVC, HLS, JavaScript, JSON, Kubernetes, Linux, Microsoft Graph API, MPEG Transport Streams, Python, RIST, SCTE104, SCTE224, SCTE35, SRT, SSAI, ST2022-7, ST2110, Statmux, Terraform, Unix, XML, YML, Zixi
Reposted 5 Days Ago
Remote or Hybrid
Santa Clara, CA, USA
166K-290K Annually
Senior level
Artificial Intelligence • Cloud • HR Tech • Information Technology • Productivity • Software • Automation
The Sr Staff Site Reliability Engineer will lead infrastructure projects, design scalable solutions, and collaborate across teams while providing technical support and mentorship.
Top Skills: AWS, Bash, Datadog, GitOps, Go, Grafana, Helm, Kubernetes, Linux, Prometheus, Python, Terraform
Reposted 6 Days Ago
Remote or Hybrid
2 Locations
160K-235K Annually
Senior level
Artificial Intelligence • Healthtech • Logistics • Social Impact • Software • Telehealth
The Senior Site Reliability Engineer will enhance the reliability and security of infrastructure for in-home healthcare services, using cloud technology and automation to improve systems and processes.
Top Skills: AWS, Bash, GCP, Python, Terraform, TypeScript
7 Days Ago
In-Office or Remote
Eden Prairie, MN, USA
73K-130K Annually
Mid level
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
Architect, build, and operate AWS commercial and government cloud infrastructure and platform services. Implement IaC, Kubernetes (EKS/AKS) management, observability, automation, incident response, and compliance (FedRAMP/NIST). Participate in on-call rotations and support production resiliency, performance, and security.
Top Skills: AKS, ArgoCD, AWS VPC, Azure DevOps, CloudFormation, CloudWatch, Dynatrace, EC2, ECS, EKS, ELB, Flux, Git, GitLab, Grafana, Helm, KMS, Kubernetes, Lambda, OWASP, PKI, Prometheus, RDS, Redshift, RESTful Services, Route53, S3, Splunk, Terraform, VPC Flow Logs
Reposted 7 Days Ago
Remote or Hybrid
Orlando, FL, USA
Expert/Leader
AdTech • Cloud • Digital Media • Information Technology • News + Entertainment • App development
The Staff Site Reliability Engineer is responsible for ensuring the reliability, performance, and security of workplace collaboration services, focusing on automation, incident management, and operational excellence while providing technical leadership and mentoring to engineers.
Top Skills: AI Engineering, Azure Virtual Desktop, Defender for Office 365, Exchange Online, Graph API, Intune, Jamf Pro, Microsoft 365, Microsoft Entra ID, Microsoft Purview, OneDrive, PowerShell, SharePoint Online, Teams
Reposted 8 Days Ago
Easy Apply
Remote or Hybrid
9 Locations
119K-170K Annually
Senior level
Cloud • Information Technology • Security • Software • Cybersecurity
As a Staff Site Reliability Engineer, you'll oversee Zscaler production data center services, optimize code, and ensure cloud service availability and performance. Collaborate with cross-functional teams to improve processes and resolve escalated issues.
Top Skills: Bash, DNS, Firewalls, Grafana, HTTP, ICMP, Load Balancing, Nagios, OSI Model, Prometheus, Python, TCP/IP
Reposted 8 Days Ago
In-Office or Remote
4 Locations
105K-300K Annually
Entry level
Information Technology • Software • Financial Services • Big Data Analytics
SREs at Citadel focus on optimizing and maintaining system reliability, performance, and automation for investment applications, collaborating closely with teams.
Top Skills: CI/CD, CSS, JavaScript, Python, React, SQL
Reposted 8 Days Ago
Easy Apply
Remote or Hybrid
6 Locations
126K-248K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills: AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS
Reposted 8 Days Ago
Easy Apply
Remote or Hybrid
United States
127K-249K Annually
Expert/Leader
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems to build secure multi-cloud infrastructure. Responsibilities include maintaining network architecture, ensuring reliable service-to-service communication, and participating in a 24/7 on-call rotation.
Top Skills: AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs
Reposted 8 Days Ago
In-Office or Remote
New York, NY, USA
150K-250K Annually
Mid level
Mobile • Software
Site Reliability Engineers will work on production infrastructure, focusing on AWS and Kubernetes while ensuring high availability and customer satisfaction.
Top Skills: Airflow, AWS, CircleCI, CloudWatch, EKS, Grafana, MongoDB, PagerDuty, Pingdom, Rust, Scala Spark, Terraform, TypeScript
Reposted 10 Days Ago
Remote or Hybrid
4 Locations
286K-392K Annually
Senior level
Fintech • Machine Learning • Payments • Software • Financial Services
The role involves leading the Card Acquisitions engineering organization, promoting engineering excellence, mentoring engineers, and delivering innovative solutions. Responsibilities include system design, hands-on coding, and developing a multi-year strategy to enhance operational efficiency and customer acquisition through advanced technologies.
Top Skills: Go, Java, JavaScript, Public Cloud Technologies, Python, SPA Frameworks, TypeScript
Reposted 10 Days Ago
Remote or Hybrid
2 Locations
160K-255K Annually
Senior level
Artificial Intelligence • Healthtech • Logistics • Social Impact • Software • Telehealth
The Staff Site Reliability Engineer at Sprinter Health will enhance the reliability and security of cloud infrastructure, automate processes, and improve system observability across healthcare delivery operations.
Top Skills: Access Management, AWS, Bash, CI/CD Systems, Cloud Networking, Container Systems, GCP, Identity Management, Logging Platforms, Monitoring Platforms, Observability Platforms, Python, Secrets Management, Terraform, TypeScript
Reposted 11 Days Ago
Easy Apply
Remote or Hybrid
Ontario, CA, USA
Senior level
Artificial Intelligence • Marketing Tech • Software
Lead technical reliability initiatives across a multi-cloud, multi-region active-active content platform. Architect and evolve core services, observability and logging, automation and capacity planning. Mentor engineers, drive cross-team reliability projects, define standards (IaC, SLOs, on-call) and proactively improve platform scalability and incident outcomes.
Top Skills: Apache Kafka, Apache Pulsar, AWS, Cassandra, Chef, EKS, GCP, GKE, Go, Grafana Alloy, Grafana Loki, Kubernetes, Linux, Node.js, Prometheus, Python, Ruby, ScyllaDB, Shell Scripting, Tempo, Terraform, Thanos
Reposted 13 Days Ago
Remote
USA
150K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills: AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform
14 Days Ago
In-Office or Remote
Minnetonka, MN, USA
Expert/Leader
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
Define and scale SRE standards across teams, implement SLOs/SLIs/error budgets, build observability and resiliency patterns, drive automation and AIOps, improve reliability for large-scale Azure cloud systems, and influence engineering and platform teams.
Top Skills: AI/ML, AIOps, Automation, Azure, Error Budgets, Incident Management, Observability (Metrics, Logs, Tracing), OpenTelemetry, SLIs, SLOs
15 Days Ago
Easy Apply
Remote or Hybrid
7 Locations
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Maintain and improve multi-cloud Kubernetes infrastructure, CI/CD (Argo Workflows/ArgoCD), observability, and networking. Build reliable continuous deployment tooling and onboarding flows, provide internal support, collaborate across Platform Engineering, contribute upstream (open-source/operators), and participate in a 24/7 on-call rotation to resolve deployment infrastructure issues.
Top Skills: Alerting, Argo Workflows, ArgoCD, AWS, Azure, CI/CD, Containers, DNS, GCP, Go, Kubernetes, Linux, Load Balancer, Observability, Python, Service Mesh, TCP/IP, TLS
Reposted 18 Days Ago
Easy Apply
Remote or Hybrid
USA
Internship
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills: Ansible, AWS ECS, Kubernetes, Linux, Python, Terraform
Reposted 18 Days Ago
Easy Apply
Remote or Hybrid
Virginia, USA
Internship
Cloud • Information Technology • Security • Software • Cybersecurity
As an intern, manage operational tasks in classified environments, develop automation tools, create documentation, and enhance services for Zscaler's cloud security platform.
Top Skills: AWS ECS, Kubernetes, Python