Top Site Reliability Engineer Jobs
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
The Senior Site Reliability Engineer will manage and enhance cloud infrastructure, focusing on automation, performance, and security while collaborating with software and DevOps teams.
Top Skills:
Argocd, Azure, Azure Monitor, Dynatrace, Flux, Grafana, Helm, Kubernetes, Prometheus, Pulumi, Restful Services, Splunk, Terraform
Fintech • Financial Services
The Site Reliability Engineer I will support cloud infrastructure and assist in cloud transformation initiatives, focusing on performance and delivery of public cloud solutions, primarily in Azure. Responsibilities include troubleshooting, monitoring, automation, and contributing to operational readiness practices for cloud services.
Top Skills:
.Net, Ansible, AWS, Azure, Azure Cli, GCP, Jenkins, Kubernetes, Linux, Powershell, Terraform, Windows
Software
Join the SRE team to improve monitoring, alerting, observability, and reliability of Fireblocks' production systems. Triage incidents, run RCA, create runbooks and automation (Python, Lambda, shell, Ansible, ArgoCD), collaborate with R&D/support, and participate in on-call rotation.
Top Skills:
Ansible, Argocd, AWS, Aws Lambda, Azure, Bash, Bitbucket, C++, Chef, Coralogix, Datadog, Docker, Gerrit, Git, Gitlab, GCP, Helm, JavaScript, Kubernetes, Linux, MySQL, New Relic, Nginx, Node.js, Phabricator, Prometheus, Puppet, Python, Shell, Splunk
Big Data • Cloud • Information Technology
The Site Reliability Engineer at Iron Mountain will troubleshoot escalated tickets, manage Windows Server builds, perform security patching, and collaborate with customers and vendors to resolve issues and maintain systems.
Top Skills:
Cloud, Compute, Hyper-Converged Infrastructure, Linux, Microsoft Endpoint Configuration Manager, Network, Nutanix, Powershell, Rubrik, Storage, Virtualization, Windows Server
Information Technology • Software • Cybersecurity • Automation
Design, build, and operate an agentic platform to automate vulnerability remediation and incident response while ensuring reliability in security operations.
Top Skills:
Datadog, Git, Grafana, Linear, Llms, Opentelemetry, Prometheus, Slack
Automotive • Hardware • Logistics
The Manager of Site Reliability Engineering leads a team to enhance cloud infrastructure reliability, automate processes, and collaborate with various teams to improve service delivery and operations.
Top Skills:
Argocd, Ci/Cd, Datadog, Dynatrace, GCP, Google Cloud Platform, Kubernetes, Terraform
Financial Services
The Lead Site Reliability Engineer will establish the SRE operating model, implement AI-enabled reliability use cases, manage reliability metrics, and oversee operational readiness while collaborating with teams and mentoring engineers.
Top Skills:
Ai/Ml, Ansible, Azure Devops, Docker, Github Actions, Gitlab Ci, Jenkins, Kubernetes, Terraform, VMware
Artificial Intelligence • Fintech • Payments • Social Impact • Analytics • Financial Services • Automation
As a Senior SRE, you'll ensure reliable and scalable systems, develop observability solutions and infrastructure as code, and lead incident response efforts.
Top Skills:
AWS, CloudFormation, Datadog, Elk, Prometheus, Terraform
Cybersecurity
As a Sr. Staff Site Reliability Engineer, you will define the reliability vision for a multi-tenant SaaS platform, lead the architecture of detection systems, and partner across teams to improve incident management and system resilience, ensuring issues are resolved before affecting customers.
Top Skills:
Argocd, AWS, GCP, Gitlab Ci/Cd, Grafana, Helm, Kubernetes, Prometheus
Artificial Intelligence • Cloud • Information Technology • Mobile • Software • Consulting
The role involves designing and implementing OpenTelemetry solutions, optimizing telemetry infrastructure, establishing SRE practices, and managing observability across cloud platforms.
Top Skills:
Argocd, AWS, Azure, Bash, CloudFormation, Docker, GCP, Github Actions, Gitlab Ci, Go, Java, Jenkins, Node.js, Opentelemetry, Powershell, Pulumi, Python, Rust, Terraform
Other • Energy
Lead SRE practices for GCP-based data platforms, automate workflows, design reliable architectures, mentor engineers, and improve operational processes.
Top Skills:
BigQuery, Ci/Cd, Cloud Logging, Cloud Monitoring, Cloud Storage, Compute Engine, Dataflow, Datastream, Github Actions, Gitlab Ci, Gke, Google Cloud Platform, Iam, Kubernetes, Pub/Sub, Python, Terraform
Artificial Intelligence • Legal Tech • Professional Services • Software
As a Staff Software Engineer in Site Reliability, you'll manage infrastructure for reliability and scalability, lead incident management, and automate operational tasks.
Top Skills:
AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, Incidentio, Pagerduty, Pulumi, Python, Sentry, Terraform
Artificial Intelligence • Legal Tech • Professional Services • Software
As a Software Engineer in Site Reliability, you will ensure the reliability and performance of our AI platform through automation and strategic infrastructure management.
Top Skills:
AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, Kubernetes, Pagerduty, Python, Sentry, Terraform
Artificial Intelligence • Big Data • Software
You will manage the infrastructure for the Data Replication team, focusing on Kubernetes, reliability standards, and integrating product features with infrastructure. You'll enhance observability and tooling using AI, ensuring engineers can effectively manage their stack.
Top Skills:
AI, AWS, Ci/Cd, Datadog, GCP, Grafana, Kubernetes, Prometheus, Terraform
Fintech • Payments
The Senior Staff SRE leads reliability engineering initiatives, drives operational excellence, mentors staff, and influences architecture to enhance system reliability and performance.
Top Skills:
Ai/Ml, AWS, Azure, Docker, Elk Stack, GCP, Grafana, Kubernetes, MySQL, NoSQL, Postgres, Splunk
Artificial Intelligence • Insurance • Software • Automation
The Staff Site Reliability Engineer will build and scale infrastructure for Assured's platform, automate delivery, enhance observability, and lead mentoring initiatives.
Top Skills:
AWS, Kubernetes, Postgres, Terraform
Healthtech • Database
Seeking a Principal Site Reliability Engineer to build an SRE practice, enhance reliability, mentor teams, and drive performance engineering to optimize Quest products and services.
Top Skills:
Ansible, Aurora, AWS, Azure, Bigtable, Cassandra, Ci/Cd, Cloud Pub/Sub, Cloud Spanner, Cloud Sql, Docker, DynamoDB, Dynatrace, Gitlab, Go, GCP, Java, Jms, Kafka, Kinesis, Kubernetes, Mq, Perl, Python, Rds, Ruby, Shell Scripting, Terraform
Fintech • Insurance • Financial Services
The Senior Site Reliability Engineer will design and maintain scalable infrastructure, develop software for reliability, implement CI/CD pipelines, monitor performance, collaborate on AI/ML workloads, and lead incident response efforts.
Top Skills:
Ansible, AWS, Azure, Dynatrace, Git, Java, Python, Terraform
Aerospace • Cloud • Software • Defense • Automation
Design and automate cloud systems for the U.S. Government, focusing on DevSecOps, reliability, deployment automation, and observability. Participate in on-call rotations, supporting production environments and improving system resilience.
Top Skills:
Aws Eks, Datadog, Gitlab, Grafana, Kubernetes, Linux/Unix, Python, Terraform
Events
The Site Reliability Engineer II designs and maintains scalable systems, focusing on automation, monitoring, incident response, and collaboration with developers to enhance operational practices and efficiency.
Top Skills:
Bash, Cloud Service Operations, Containers, Continuous Delivery, Continuous Integration, Go, Infrastructure As Code, Orchestration Platforms, Python
Artificial Intelligence • Software
The Site Reliability Engineer ensures the reliability and performance of the Devin and Windsurf products, managing incident response, CI/CD pipelines, and infrastructure as code, and fostering a reliability culture within the engineering team.
Top Skills:
AWS, Azure, Ci/Cd, GCP, Kubernetes, Terraform
Healthtech • Professional Services • Software
The Sr Software Engineer leads complex software development, ensuring solution scalability, collaborating with teams, solving technical problems, and advocating for high-quality software solutions.
Top Skills:
Angular, Argo Cd, Azure Devops, Ci/Cd, Google Cloud Platform, Kubernetes, New Relic, Opentelemetry, Ruby On Rails, Terraform
Fintech • Financial Services
The Site Reliability Engineer Lead oversees daily operations and architectural resilience, driving SRE principles for application performance and efficiency, and fostering a culture of technical excellence.
Top Skills:
Ansible, Appdynamics, Go, Grafana, Java, Kubernetes, Loki, Mimir, Openshift, Prometheus, Python, Tempo, Terraform
Software
Lead SRE to define SRE strategy, architecture, and roadmap; design and operate containerized, compliant cloud environments; build observability, incident management, automation, and developer platform capabilities; mentor SRE team and collaborate with security, compliance, and product teams to ensure reliability at scale.
Top Skills:
AWS, Aws Marketplace, Azure, Azure Marketplace, GCP, Google Cloud Marketplace, Grafana, Kubernetes, Prometheus, Terraform
Fintech • Analytics
The Site Reliability Engineer will manage production monitoring, incident response, and enhance automation using various tools. They will ensure observability and participate in SRE process improvements.
Top Skills:
AWS, Cucumber, Datadog Apm, Datadog Dbm, DynamoDB, Ec2, Ecs, Elk, Java, Jenkins, Pagerduty, Playwright, Rds, S3, Secrets Manager, Selenium, Servicenow, Splunk, Spring Boot