Top Site Reliability Engineer Jobs
Reposted 6 Hours Ago
Easy Apply
AdTech
As a Site Reliability Engineer, you'll maintain the infrastructure for systems, ensure efficiency, automate processes, monitor databases, and participate in architecture discussions.
Top Skills:
Amazon Kinesis, AWS Lambda, AWS SNS, BigQuery, Docker, GCP (Google Cloud Platform), GitLab, Google Cloud Functions, Google Cloud Run, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, Prometheus, Spanner, SQL, Terraform
Reposted 6 Hours Ago
Easy Apply
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills:
Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform
Artificial Intelligence • Cloud • Enterprise Web • Natural Language Processing • Software • App development • Automation
Design and implement large-scale distributed systems that integrate AI safely and reliably, focusing on infrastructure, observability, and security.
Top Skills:
Cloud Networking, Containers, Distributed Systems, Event Driven Runtimes, KEDA, Knative, Kubernetes, Multi Cloud Architecture, Operating Systems, Scalability
Cloud • Computer Vision • Information Technology • Sales • Security • Cybersecurity
Lead and manage an SRE/Platform engineering team to ensure reliability, scalability, and performance of CrowdStrike's cloud-native security platform. Provide technical leadership, incident command, SLO-driven reliability, capacity planning, automation, and mentorship while collaborating with cross-functional teams.
Top Skills:
Apache Flink, Apache Kafka, AWS, Azure, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, OpenTelemetry, Prometheus, Splunk
2 Days Ago
Easy Apply
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.
Top Skills:
Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform
Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing
Lead Site Reliability Engineer responsible for platform stability, mentoring, and improving application performance through automation, configuration management, and operational readiness.
Top Skills:
Go, Java, Python, Spring Framework
Aerospace • Hardware • Information Technology • Security • Software • Cybersecurity • Defense
The role involves deploying and monitoring cloud solutions, ensuring service delivery, resolving issues, scaling systems through automation, and collaborating with teams in a hybrid work environment.
Top Skills:
Ansible, Azure Stack, Ceph, Helm Charts, IaaS, Jdfs, Kubernetes, NFS, Object Storage, OpenStack, PaaS, SaaS, VMware
Aerospace • Hardware • Information Technology • Security • Software • Cybersecurity • Defense
As a Site Reliability Engineer, you will deploy and monitor cloud solutions, implement automation and infrastructure support, and collaborate across teams to enhance service delivery and performance.
Top Skills:
Ansible, Azure Stack, Ceph, Helm Charts, IaaS, Jdfs, Kubernetes, NFS, Object Storage, OpenStack, PaaS, SaaS, VMware
Aerospace • Hardware • Robotics • Software • Manufacturing
Design, implement, and maintain a scalable SRE/DevOps platform across cloud and on-prem sites. Ensure uptime, automate deployments with IaC, define SLOs, leverage configuration management, and partner with development and manufacturing teams to increase automation and reliability.
Top Skills:
GCP, Helm, Kubernetes, Terraform
Cloud • Information Technology • Security • Software • Cybersecurity
Operate and scale cloud infrastructure for US Government classified environments. Manage deployments, on-call rotation, incident response, automation (scripts, IaC), containerized services, monitoring, and documentation to ensure 24x7 service reliability and prevent recurring incidents.
Top Skills:
Ansible, AWS, AWS ECS, Containers, Kubernetes, Linux, Private Cloud, Python, Terraform, Virtual Machines
Big Data • Fintech • Information Technology • Business Intelligence • Financial Services • Cybersecurity • Big Data Analytics
The Staff Site Reliability Engineer will lead reliability strategies, manage high-risk initiatives, and enhance engineering standards while ensuring system reliability and operational excellence within a hybrid work environment.
Top Skills:
Bash, CI/CD, Database Architecture, Go, Google Cloud Platform, Infrastructure-as-Code, Kubernetes, Monitoring Platforms, Pulumi, Python, Terraform
eCommerce • Fashion • Retail • Sales • Wearables • Design
The Lead Site Reliability Engineer is responsible for ensuring system reliability, uptime, and performance across Tapestry brands. They will develop tools and automation, oversee monitoring solutions, and implement SRE practices to promote reliability. Additionally, the role involves production support and collaboration across engineering teams to manage the Salesforce Commerce Cloud platform effectively.
Top Skills:
AppDynamics, AWS, Azure, Blue Triangle, Confluence, GCP, Java, Jira, Node.js, Python, Quantum Metric, Splunk
Reposted 3 Days Ago
Automotive • Big Data • Information Technology • Robotics • Software • Transportation • Manufacturing
The Staff Engineer will define reliability architecture, automate foundational utilities, develop observability tools, ensure environment integrity, and mentor colleagues.
Top Skills:
Ansible, Chef, DHCP, Kubernetes, Linux, NTP, PXE
Reposted 4 Days Ago
Easy Apply
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Cloud • Information Technology • Security • Software • Cybersecurity
The role involves creating scalable solutions using Linux and Kubernetes, troubleshooting performance issues, maintaining security, and writing automation tools.
Top Skills:
Ansible, Bash, Docker, Firewall Technologies, Go, Kubernetes, KVM, Linux, Multi-Factor Authentication, OpenStack, PGP, PKI, Python, SSH, Unix
Enterprise Web • Hardware • Internet of Things • Software
The Senior Site Reliability Engineer will mentor teams on observability practices, architect systems for growth, automate developer tasks, and debug production issues.
Top Skills:
Go, Kubernetes, LGTM Stack, OpenTelemetry, Prometheus, TypeScript
Aerospace • Hardware • Information Technology • Security • Software • Cybersecurity • Defense
Design, build, and maintain scalable, reliable infrastructure by applying software engineering to operations. Improve automation, observability, CI/CD, and security hardening; troubleshoot Linux/Unix systems and networking; reduce operational toil while supporting mission-critical defense systems requiring TS/SCI clearance and polygraph.
Top Skills:
DNS, Git, GitHub Actions, GitLab CI, Go, Java, Jenkins, Linux, Python, TCP/IP, Unix
Fintech • Information Technology • Payments • Productivity • Software • Travel • Automation
The Site Reliability Engineer will design and develop automated solutions and infrastructure to enhance service reliability and efficiency, collaborating closely with various teams to meet customer needs.
Top Skills:
AI, AWS, CloudFormation, Datadog, Go, Java, Jenkins, Kibana, Maven, ML, New Relic, Node.js, Python, SignalFx, Terraform
Artificial Intelligence • Big Data • Healthtech • Information Technology • Machine Learning • Software • Analytics
Design, deploy, and maintain Kubernetes-based infrastructure and AI-enabled solutions. Build CI/CD pipelines with GitHub Actions, provision GCP infrastructure with Terraform, manage Kafka streaming, monitor systems with Prometheus and Grafana, write Python automation, troubleshoot incidents, and participate in on-call rotations to ensure high availability and reliability.
Top Skills:
Apache Kafka, Cloud Storage, Compute Engine, GitHub Actions, Google Cloud Platform, Grafana, Kubernetes, Kubernetes Engine, Prometheus, Python, Terraform
Artificial Intelligence • Cloud • Fintech • Information Technology • Insurance • Financial Services • Big Data Analytics
Design, automate, and maintain scalable, secure AWS infrastructure and CI/CD pipelines. Lead observability, incident response, and reliability improvements while modernizing mainframe workloads to AWS and collaborating with engineering, security, and data teams.
Top Skills:
AI Tools, AWS, CI/CD, CloudFormation, Data Analytics, Databases, Event-Driven Architecture, Mainframe Development, Messaging Services, Observability and Alerting Tools, Terraform
Information Technology • Web3
The Site Reliability Engineer manages AWS Kubernetes infrastructure, ensuring operational excellence, security, and scalability, while implementing reliability improvements and best practices.
Top Skills:
Argo CD, AWS, Bash, Datadog, EKS, Go, Kafka, Kubernetes, Postgres, Python, Sysdig, Terraform
Information Technology • Software • Financial Services • Big Data Analytics
SREs at Citadel focus on optimizing and maintaining system reliability, performance, and automation for investment applications, collaborating closely with teams.
Top Skills:
CI/CD, CSS, JavaScript, Python, React, SQL
Artificial Intelligence • Big Data • Healthtech • Machine Learning • Software • Biotech
Lead the design and operation of a fault-tolerant cloud infrastructure, implement infrastructure-as-code, manage Kubernetes reliability, and mentor engineers.
Top Skills:
Ansible, AWS, Azure, Bash, CloudFormation, Datadog, GCP, GitHub Actions, GitLab CI, Go, Grafana, Jenkins, Kubernetes, OpenTelemetry, PowerShell, Prometheus, Python, Terraform
Reposted 7 Days Ago
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills:
AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform
Machine Learning • Payments • Security • Software • Financial Services
The Technology Engineer - Mainframe Systems at PNC supports and enhances mainframe environments, ensuring system stability and performance, collaborating with various teams, and managing batch processes and file transfers.
Top Skills:
COBOL, Db2, File-AID, IBM Mainframe Technologies, TSO, VSAM