Top Site Reliability Engineer Jobs
Security • Software • Analytics
Design, operate, and automate scalable, secure infrastructure for Axiom Cloud. Define SLOs, plan disaster recovery and capacity, tune performance, improve deployment practices, build reliability tooling, respond to incidents, and promote monitoring and observability across teams.
Top Skills:
AWS, Docker, Kubernetes, Amazon EKS, Terraform, Pulumi, Linux, GitHub Actions, GitLab, CircleCI, LLMs, Golang, Monitoring and Observability Tools
Cloud • Information Technology • Security • Software
Lead and grow a global Cloud Support/SRE team to ensure SaaS and self-hosted infrastructure reliability. Own incident response for Severity 1 events, refine support workflows, track KPIs (CSAT, MTTR, first-response), and collaborate with Product, Engineering, and Solutions teams to drive product improvements and operational excellence.
Top Skills:
AWS, Azure, GCP, Linux, Kubernetes, TCP/IP, DNS, Load Balancing, SSL/TLS, Python, Bash, Go
Fintech • Payments • Financial Services
The Senior Data Platform Administrator will engineer and maintain scalable big data platforms, providing operational excellence and technical guidance while fostering a culture of collaboration and innovation within the team.
Top Skills:
Ansible, AWS EMR, Bash, CloudFormation, Infrastructure as Code, Python, Spark, SQL, Terraform
Other
As a Platform/DevOps Engineer, you will expand cloud infrastructure, implement monitoring systems, manage databases, and use CI/CD tools, working collaboratively with various teams.
Top Skills:
AWS, Azure, Bash, Datadog, ELK Stack, Kubernetes, OpenTofu, Prometheus, Python, Terraform
Security • Software • Cybersecurity
The Site Reliability Engineer will manage software development tools, optimize configurations, respond to incidents, drive automation, and support migrations for efficiency.
Top Skills:
Artifactory, AWS, Azure, Bash, ClickUp, Confluence, Docker, Figma, FullStory, GCP, Git, Grafana, IAM, Jira, Kubernetes, Okta, Power BI, Prometheus, Python, Splunk, Terraform
Fintech • Insurance • Financial Services
The Senior Site Reliability Engineer will design and maintain scalable infrastructure, develop software solutions for reliability, manage CI/CD pipelines, and collaborate with AI teams to enhance operational excellence.
Top Skills:
Ansible, AWS, Azure, CI/CD, Dynatrace, Git, Java, Python, Terraform
Artificial Intelligence • Big Data • Computer Vision • Machine Learning • Natural Language Processing • Software • Cybersecurity
Maintain and improve the internal developer platform, observability stack, and AWS infrastructure (Terraform); manage Kubernetes at scale; troubleshoot distributed systems; drive security, reliability, cost and performance improvements; partner with product teams and participate in on-call support.
Top Skills:
AWS, CKA, Containers, Go, Kubernetes, LGTM Stack, Linux, OpenSearch, Python, Serverless, TCP/IP, Terraform
Fintech • Financial Services
Design, automate, and maintain reliable, scalable systems; monitor and respond to incidents; perform capacity planning and performance tuning; build operational tooling; collaborate with development teams and lead/coach staff to improve resilience and operational practices.
Fintech • Software
Ensure availability, performance, scalability, and reliability of production systems by defining SLIs/SLOs, implementing monitoring and incident response, automating operations and CI/CD, managing cloud/hybrid infrastructure, capacity planning, and collaborating with engineering and security teams to improve reliability.
Top Skills:
Ansible, AWS, Azure, Bash, CI/CD, Datadog, DNS, Docker, ELK, GCP, Go, Grafana, Infrastructure as Code, Kubernetes, Linux, Load Balancing, Prometheus, Puppet, Python, Splunk, Terraform, Unix, VMware, Windows
Security • Cybersecurity
Lead the design and implementation of observability, SLO/SLA frameworks, and AI-enabled infrastructure automation. Architect scalable AWS infrastructure, improve incident management and on-call practices, and drive organization-wide adoption of telemetry and reliability standards.
Top Skills:
AI-Assisted Tooling, AWS, CI/CD, Claude, Codex, Cursor, Grafana, Honeycomb, Infrastructure-as-Code, Observability, Pulumi, Supabase, Telemetry, Terraform, Vercel
Information Technology
Lead technical strategy for observability, operational intelligence, and reliability. Architect telemetry and automation platforms, drive AIOps and large-scale IaC, lead incident response, mentor senior engineers, and standardize SLO/SLI and reliability practices across AWS cloud-native environments.
Top Skills:
ALB, AWS, VPC, Bash, CloudFormation, Datadog, DNS, DynamoDB, EC2, ECS, EKS, GitOps, Go, Grafana, IAM, KMS, Kubernetes, Linux, Multi-Account Architectures, New Relic, NLB, OpenTelemetry, Policy-as-Code, Prometheus, Python, RDS, Route 53, S3, TCP/IP, Terraform, TLS
Cloud • Information Technology • Internet of Things • Software • Consulting • Infrastructure as a Service (IaaS) • Automation
Lead technical design and architecture for internal private and multi-cloud infrastructure, manage OpenShift/OpenStack platforms, automate operations, advise customers, and represent Red Hat at open-source events.
Top Skills:
Linux, OSI Layers, Cisco, Juniper, Python, Golang, Rust, OpenStack, OpenShift, OpenShift Virtualization, ROSA, Red Hat OpenShift, Kubernetes, Dell, Cisco UCS, Redfish, NetApp, AWS, IBM Cloud, Azure, CI/CD, SAST, Linting, Unit Testing
Healthtech • Payments • Software
The SRE Specialist will design reliability solutions, enhance system observability, respond to incidents, and collaborate with engineering teams to improve data platforms.
Top Skills:
Apache Airflow, AWS, Azure, CloudFormation, GCP, Grafana, Kafka, Kubernetes, PowerShell, Prometheus, Python, Spark, Splunk, Terraform
Natural Language Processing • Software • Conversational AI
Maintain and improve reliability of the Echo platform by operating GKE production workloads, implementing GitOps deployments, defining SLOs/SLIs, enhancing observability with OpenTelemetry, troubleshooting incidents, and collaborating with developers on safe CI/CD and progressive delivery.
Top Skills:
Google Kubernetes Engine (GKE), Kubernetes, GitOps, Argo CD, Flux, OpenTelemetry (OTel), CI/CD, Service Mesh, Ingress, Load Balancing, DNS, Cloud Networking
Fintech • Analytics
The Site Reliability Engineer will manage production monitoring, incident response, and enhance automation using various tools. They will ensure observability and participate in SRE process improvements.
Top Skills:
AWS, Cucumber, Datadog APM, Datadog DBM, DynamoDB, EC2, ECS, ELK, Java, Jenkins, PagerDuty, Playwright, RDS, S3, Secrets Manager, Selenium, ServiceNow, Splunk, Spring Boot
Information Technology
As a Senior Site Reliability Engineer, you will enhance system resilience, automate tasks, and ensure robust infrastructure for national security.
Top Skills:
Confluence, Docker, Git, Go, Java, Jenkins, Jira, Kubernetes, Linux, Nessus, Packer, Python, Rust
Other
The Sr. Site Reliability Engineer will maintain and administer enterprise systems, troubleshoot operational issues, and develop scripts. This role requires collaboration across teams and participation in project planning and execution.
Top Skills:
Ansible, Apache, Azure, C#, Chef, IIS, Java, JBoss, Perl, PowerShell, Puppet, Python, Ruby, Tomcat
Aerospace • Hardware • Logistics • Robotics • Software • Transportation
The Senior Site Reliability Engineer will lead cloud infrastructure initiatives, develop best practices, write software, and manage systems while working closely with developers. They will also participate in an on-call rotation and set high technical standards for interviews.
Top Skills:
AWS, Kafka, Kubernetes
Aerospace • Hardware • Defense
Lead design, build, and operation of scalable, reliable cloud infrastructure; mentor engineers; make architecture and technology decisions; introduce new tools; lead cross-team initiatives; participate in on-call rotations and incident response.
Top Skills:
Alerting, AWS, EC2, GitOps, Infrastructure-as-Code (IaC), Kubernetes, Lambda, Monitoring, On-Call, S3, Service Mesh, Service Registration, Terraform, VPC
Real Estate • Software
As a Senior Site Reliability Engineer, you'll enhance system performance, reliability, and cost efficiency in a large-scale production environment, shifting manual operations to AI-assisted engineering.
Top Skills:
Ansible, Datadog, ELK, Grafana, Kubernetes, Linux, Prometheus, Python, Ruby, Terraform
AdTech • Marketing Tech • Analytics
The Staff SRE DevOps Engineer will manage customer applications, improve system reliability, collaborate on architecture discussions, and support infrastructure needs across teams.
Top Skills:
AWS, Bash, Datadog, Docker, Kafka, Kibana, Kubernetes, Linux, Postgres, Python, Redshift, Spark, Terraform
AdTech • Marketing Tech • Analytics
As a Staff Software Engineer - SRE, you'll manage cloud infrastructure, improve application reliability, collaborate across teams, and support back-office systems.
Top Skills:
AWS, Datadog, Docker, Kafka, Kibana, Kubernetes, Linux, Postgres, Python, RDS, Redshift, Shell/Bash, Spark, Terraform
AdTech • Marketing Tech • Analytics
Manage and support customer applications, improve system reliability, collaborate with teams on infrastructure needs, and help drive architectural decisions.
Top Skills:
Auto Scaling, AWS, CDNs, Datadog, DNS, Docker, Kafka, Kibana, Kubernetes, Linux, Load Balancers, Postgres, Proxy Servers, Python, RDS, Redshift, Shell/Bash, Spark, Terraform, WAFs
Fintech • Consulting
The Senior Site Reliability Engineer at Equifax ensures service reliability and performance, builds infrastructure as code, manages cloud systems, and leads incident resolution efforts.
Top Skills:
Ansible, Argo CD, AWS, Bash, Chef, Datadog, Docker, GCP, GitHub Actions, Go, Java, JavaScript, Jenkins, Kubernetes, Node.js, Python, Terraform
Fintech • Payments
Senior Site Reliability Engineers at PayPal ensure the reliability and performance of mobile and backend systems, implementing standards, automation, and observability while managing incidents and mentoring junior staff.
Top Skills:
AWS, Azure, Datadog, Firebase Crashlytics, GCP, Go, Python, Sentry