Top Site Reliability Engineer Jobs
Computer Vision • Machine Learning • Software
As a Site Reliability Engineer, ensure the reliability, performance, and scalability of Ditto's cloud infrastructure by developing observability solutions, leading incident management, and collaborating with product engineering teams.
Top Skills:
AWS, Azure, C, Datadog, GCP, Go, Grafana, Helm, Java, Kubernetes, Prometheus, Rust, Terraform
Information Technology
The Lead Site Reliability Engineer will ensure platform reliability and performance, guiding SRE principles, managing incidents, and fostering collaboration across teams while leveraging cloud technologies and automation.
Top Skills:
AWS, Azure, Azure DevOps, Bash, Bicep, CloudFormation, GitHub Actions, Go, Jenkins, PowerShell, Python, Terraform
Aerospace
The Site Reliability Engineer will ensure system reliability, assist with incident response, improve operational quality, and automate processes to reduce toil. Responsibilities include incident resolution, reliability evaluations, and platform enablement.
Top Skills:
Argo CD, Docker, Git, GitLab, Go, Grafana, Jenkins, Kubernetes, OTel Standards, Prometheus, Python
Artificial Intelligence • Healthtech • Software
The Staff Site Reliability Engineer will lead the reliability of production systems by defining SRE practices, improving observability, and ensuring fault-tolerance in cloud environments.
Top Skills:
AWS, Go, Kubernetes, Postgres, Python, Terraform, TypeScript
Fitness • Healthtech • Retail • Pharmaceutical
The Executive Director of Digital SRE & Operations will lead strategy and execution for enterprise-scale reliability and operational excellence, overseeing AIOps, automation, and DevOps while mentoring SRE teams.
Top Skills:
AIOps, AWS, Azure, Datadog, GCP, Grafana, OpenTelemetry, Prometheus, Splunk
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills:
AWS EKS, Datadog, GitHub Actions, Go, Istio, k6, Kubernetes, Node.js, Terraform
Healthtech • Software
Seeking a Site Reliability Engineer to ensure platform reliability, scalability, and performance by leveraging AI and automation in cloud infrastructure. Responsibilities include incident response, monitoring, and operational efficiency enhancement.
Top Skills:
AI, AWS, CI/CD, Terraform
Software
As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.
Top Skills:
AWS, Datadog, Grafana, Kubernetes, OpenTelemetry, Prometheus, TypeScript
Artificial Intelligence • Machine Learning • Generative AI
As a Site Reliability Engineer, you will manage Kubernetes clusters, automate infrastructure, improve operational metrics, and enhance reliability across data centers.
Top Skills:
CloudFormation, Go, GPU, Kubernetes, Linux, Python, Terraform
Cloud • Security • Software • Cybersecurity
The Senior Lead Site Reliability Engineer will ensure performance and uptime of security products, develop automation pipelines, and improve monitoring systems, working closely with various teams.
Top Skills:
Azure, Databricks, Docker, Go, Jenkins, Kubernetes, Python, Terraform
Financial Services
As a Principal Application Support Engineer, you'll ensure system reliability and operational resilience by implementing SRE practices and leading incident management efforts.
Top Skills:
AWS, Azure, Dynatrace, GCP, Go, ITSI, Java, Linux, Python, Splunk, Unix
Information Technology • Insurance • Software
The Senior Site Reliability Engineer is responsible for the reliability and performance of production services, including incident response, service design, and automation of operations.
Top Skills:
.NET, AWS, C#, CI/CD, Infrastructure as Code, Java, Kubernetes, Linux, Python, React, Windows
AdTech • Artificial Intelligence • Marketing Tech • Software • Analytics
The Senior Site Reliability Engineer will enhance system reliability, develop production-grade code, implement observability tools, conduct root cause analyses, and collaborate on system design for scalability.
Top Skills:
Argo CD, CI/CD, Docker, GitOps, Go, Grafana, Honeycomb, Jenkins, Kubernetes, OpenTelemetry, Prometheus, Python, Terraform
Artificial Intelligence • Hardware • Information Technology • Security • Software • Cybersecurity • Big Data Analytics
The Senior Site Reliability Engineer will support cloud operations, implement observability strategies, and optimize applications for availability and performance.
Top Skills:
.NET, Ansible, C#, Git, Grafana, Kubernetes, Prometheus
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer, you will maintain service uptime, improve automation, and ensure infrastructure reliability while collaborating with engineering teams at Braze.
Top Skills:
Chef, Docker, Kafka, Kubernetes, Linux, MongoDB, Redis, Ruby on Rails, Terraform, Unix Shell
Gaming • Information Technology • Mobile • Software • Esports
Seeking a Senior Site Reliability Engineer to design and operate scalable platform solutions, enhance reliability, and improve developer experience and operational efficiency across engineering teams.
Top Skills:
AWS, GCP
Aerospace • Artificial Intelligence • Hardware • Robotics • Security • Software • Defense
Responsible for deploying and managing cloud environments, integrating platform services, enhancing data pipelines, and collaborating on operational testing for TRS systems.
Top Skills:
Ansible, AWS, Azure, Confluence, CUDA, Docker, GCP, Git, GitHub Actions, Grafana, JFrog Artifactory, Jira, Kubernetes, Nominal, OpenCL, Python, Terraform
Cloud
The role involves building and managing observability infrastructure in GCP, automating deployments, and optimizing data processes for high reliability.
Top Skills:
GKE, Go, GCP, Grafana, Kubernetes, OpenTelemetry, Python, Ruby, Splunk, Terraform
Edtech
The Lead Software Engineer will lead the SRE team, focusing on reliability, performance optimization, security, and mentoring developers, while improving overall platform resilience.
Top Skills:
ActiveJob, Ansible, AWS, AWS CloudWatch, EC2, ECS, Elasticsearch, Git, GCP, Google Cloud Stackdriver, Jenkins, Jira, Kubernetes, Memcached, MongoDB, New Relic, Node.js, Postgres, Redis, Ruby on Rails, Sidekiq, Spinnaker, Terraform, Terragrunt
Fintech • Analytics
The role involves managing application services, driving improvements, handling incidents, and leveraging domain knowledge to enhance service quality and efficiency.
Top Skills:
Datadog, ITRS
Software
The Site Reliability Engineer will enhance monitoring systems, improve user experience, optimize alerting, and analyze data for informed decision-making.
Top Skills:
Ansible, AWS, Azure, Bash, Datadog, ELK Stack, GCP, Git, Grafana, Jenkins, Nagios, New Relic, PowerShell, Prometheus, Python, Terraform
Financial Services
The Staff Engineer will support and optimize messaging platforms, design solutions to improve operational efficiency, and collaborate with teams on business-focused solutions.
Top Skills:
AMPS, AWS, EKS, FIX, Java, Kafka, Kubernetes, Linux, MQ, Spring, SQL
Travel
Deploy, operate, and automate large-scale cloud-native and Kubernetes workloads (GKE) with emphasis on reliability, observability, SLO/SLA design, GitOps deployments, on-call incident response, and building self-service platforms and automation to reduce operational toil.
Top Skills:
Ansible, Argo CD, Bash, Claude Code, Cursor, GCP, GitHub Actions, GitHub Copilot, GitOps, Go, Grafana, Helm, Istio, Kubernetes (GKE), Kustomize, Kyverno, Linux (RHEL/Rocky, Ubuntu, Alpine/Distroless), New Relic, OpenTelemetry, Prometheus, Python, Splunk, Terraform
Fintech
The Site Reliability Engineer will manage AWS infrastructures, oversee application deployments, and ensure system reliability and security while collaborating with teams.
Top Skills:
AWS, Bash, CodeBuild, CodeDeploy, CodePipeline, EC2, IAM, Python, RDS, Route 53, S3, Terraform, VPC
Fintech
Lead adoption of SRE practices to improve reliability, observability, automation, and incident response. Implement and maintain observability tooling, instrumentation, CI/CD, and infrastructure-as-code. Partner with developers, participate in on-call rotations, drive postmortems, and reduce operational overhead through automation.
Top Skills:
Anthropic, AWS, AWS ECS, AWS EKS, Azure, C#, Docker, GitLab CI, Grafana, Linux, OpenAI, Prometheus, Puppet, Python, Splunk, Terraform, TypeScript, Windows