Top Site Reliability Engineer Jobs
Cybersecurity
As a Principal SRE, build and maintain secure cloud infrastructure, drive automation, and ensure operational excellence in a FedRAMP compliant environment.
Top Skills:
Backstage, Bash, Docker, FireHydrant, GCP, GitLab CI/CD, GitOps, Go, Grafana, Java, Kubernetes, Loki, MySQL, Node.js, PagerDuty, Prometheus, Python, Terraform
Healthtech • Software • Biotech
Lead the evolution of the company's cloud-native infrastructure on AWS and Kubernetes, enhance CI/CD processes, and mentor engineering teams.
Top Skills:
AWS, EKS, S3, IAM, CircleCI, GitHub Actions, Argo CD, Terraform, Datadog, Honeycomb
Blockchain • Fintech • Information Technology • Payments • Software • Financial Services • Cryptocurrency
The DevOps/SRE Engineer will ensure the reliability and scalability of crypto and fintech products, automate operations, and optimize system performance.
Top Skills:
Ansible, AWS, Bash, Chef, DigitalOcean, GCP, Git, GNU/Linux, Grafana, Hetzner, JavaScript, Kubernetes, Postgres, Prometheus, Pulumi, Puppet, Python, RabbitMQ, Redis, Terraform, TypeScript, Zabbix
Hardware • Quantum Computing
The Sr. SRE will design and operate reliable systems, focusing on automation, incident management, and infrastructure optimization. Collaborates cross-functionally to ensure operational excellence.
Top Skills:
Ansible, AWS, Azure, Bash, GCP, Go, Grafana, Helm, Kubernetes, OpenTelemetry, Prometheus, Python, Terraform
Software • Cryptocurrency
Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.
Top Skills:
AWS EC2, AWS EKS, Datadog, Docker, IAM, Kubernetes, OpenTelemetry, Pulumi, RDS, S3, Terraform
Cloud • Information Technology • Security • Software • Cybersecurity
The Staff Site Reliability Engineer will manage FedRAMP cloud products, perform operational duties, enhance monitoring systems, and automate cloud infrastructure.
Top Skills:
Ansible, AWS, GovCloud, Kubernetes, Linux, Python, Terraform
Cloud • Information Technology • Security • Software • Cybersecurity
The Staff Site Reliability Engineer will manage FedRAMP cloud product operations, automate processes, handle incidents, and ensure compliance in a hybrid role.
Top Skills:
Ansible, AWS GovCloud, Kubernetes, Linux, Networking, Python, Terraform
eCommerce
The Staff Back-end Engineer (SRE) will build, run, and scale ecommerce systems, ensuring reliability and performance for customer-facing services, while utilizing automation and best practices.
Top Skills:
AWS, Azure, Datadog, Docker, Elastic Stack, Go, Google Cloud Platform, Grafana, Java, Kubernetes, New Relic, Prometheus, Python, Ruby
Information Technology • Software
The Site Reliability Engineer will manage and scale infrastructure, automate deployments, and lead efforts in operational process management while participating in a 24x7 on-call rotation.
Top Skills:
Ansible, Docker, FreeBSD, FreeIPA, Jenkins, Kubernetes, Linux, OpenStack, Python, Red Hat Enterprise Linux, Terraform
Artificial Intelligence • Software • Generative AI
As a Site Reliability Engineer, you'll design and maintain cloud infrastructure, automate provisioning, ensure system reliability, and mentor junior engineers while leveraging various technologies to optimize performance and security.
Top Skills:
AWS, Azure, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, Prometheus, Python, Scala, Terraform
Software
The Principal Site Reliability Engineer will enhance system reliability, implement monitoring systems, collaborate across teams, and ensure platform uptime and performance.
Top Skills:
AWS, Azure, Datadog, GCP, Grafana, Java, Kubernetes, Node.js, Prometheus, Python
Artificial Intelligence • Generative AI
Lead GPU cluster design and operations, manage Kubernetes, implement Infrastructure-as-Code, and develop observability stacks for high-performance AI models.
Top Skills:
Ansible, Argo CD, Bash, eBPF, Flux, GitOps, Grafana, Helm, InfiniBand, Kubernetes, NVIDIA DCGM, OpenTelemetry, Prometheus, Python, RDMA, Terraform
Healthtech
The role involves designing and implementing the platform, managing CI/CD pipelines, automating tasks, and maintaining cloud environments.
Top Skills:
Ansible, AWS, Docker, GCP, Git, Groovy, IaC, MySQL, PHP, Terraform
Gaming
Manage operational tasks for gaming services, design runtime environments, monitor metrics, optimize architecture, and research software solutions.
Top Skills:
C/C++, Go, Istio, Java, K8s, Linux, MySQL, Nginx, Python, Rust, Shell
Cloud • Security • Software • Cybersecurity
As a Staff Site Reliability Engineer, you will lead SRE initiatives, mentor engineers, ensure system reliability, and drive strategic engineering practices globally.
Top Skills:
C#, Go, Grafana, Java, JavaScript, Kubernetes, OpenTelemetry, Prometheus, Pulumi, Terraform, TypeScript
Cloud • Security • Software • Cybersecurity
The Principal Site Reliability Engineer will lead Veeam's global SRE efforts, focusing on architecture, reliability strategies, and mentorship while influencing cross-functional teams.
Top Skills:
Automation Tooling, Cloud Infrastructure, Cloud-Native Development, Distributed Systems
Artificial Intelligence • Big Data • Machine Learning • Software
The role involves designing and implementing custom installations of the C3 AI Platform for Federal customers, ensuring uptime, and automating system processes while collaborating with cross-functional teams.
Top Skills:
Ansible, AWS, Azure, Bash, Kubernetes, Linux, Puppet, Python, Ruby, Terraform
Fintech • Financial Services
Responsible for network deployments, automation, and system monitoring. Collaborates with teams to enhance network design and performance, ensuring scalability and security.
Top Skills:
Ansible, Arista, BGP, Cisco, CloudFormation, Datadog, Fortinet, Git, JSON, Juniper, Linux, MPLS, OSPF, Prometheus, Python, STP, Terraform, Unix, VXLAN, YAML
Agency • Cloud • Information Technology • Mobile • Software
The role involves designing and implementing observability solutions using OpenTelemetry, managing infrastructure through IaC, and establishing SRE practices. Strong expertise in cloud and DevOps engineering is required.
Top Skills:
Argo CD, AWS, Azure, Bash, CloudFormation, Docker, GCP, GitHub Actions, GitLab CI, Go, Java, Jenkins, Kubernetes, Node.js, OpenTelemetry, PowerShell, Pulumi, Python, Rust, Terraform
Food • Internet of Things
As a Site Reliability Engineer, you will manage cloud infrastructure, implement SRE best practices, automate tasks, and collaborate with teams for system reliability and performance.
Top Skills:
Ansible, AWS, Azure, Bash, CircleCI, Docker, ELK Stack, GCP, GitHub Actions, Grafana, Jenkins, Kubernetes, Linux, Prometheus, Python, Terraform, Unix
Software
As a Lead SRE at Commvault, you'll ensure the quality and reliability of the Clumio Data Platform in AWS, collaborating across teams to enhance infrastructure and maintain SLAs.
Top Skills:
AWS, Docker, IP Networking, ITIL, Kubernetes, Linux, Python, Terraform
Software
The Site Reliability Engineer will enhance system reliability, improve tooling, oversee incident processes, and collaborate on software maintenance across distributed systems.
Top Skills:
ClickHouse, gRPC, Kafka, MongoDB, NoSQL, Postgres, Redpanda
Artificial Intelligence • Blockchain • Internet of Things • Machine Learning • Software • App development • Automation
As a Staff SRE, you will ensure the reliability, scalability, and performance of systems, lead incident management, and drive automation efforts.
Top Skills:
Ansible, AWS, Azure, Bash, Docker, ELK Stack, GCP, GitLab CI, Go, Grafana, Java, Jenkins, Kubernetes, Prometheus, Python, Terraform
Artificial Intelligence • Blockchain • Internet of Things • Machine Learning • Software • App development • Automation
Join the Gigster Talent Network as an SRE Support Engineer, providing support for scalable applications and cloud services, including troubleshooting and improving internal tools.
Top Skills:
Ansible, AWS, Bash, Datadog, Docker, GCP, Grafana, Kafka, Kubernetes, Prometheus, Puppet, Python, Spark, Splunk, Terraform
Cloud • Software
As a Site Reliability Engineer, you'll manage technical escalations, ensure system reliability, collaborate with engineering teams, and participate in on-call rotations.
Top Skills:
Ansible, Azure, Bash, C#, Chef, ELK, Git, GitHub Actions, GitLab, Grafana, Jenkins, Linux/Unix, Prometheus, Pulumi, Python, Splunk, SVN, TCP/IP, Terraform