Top Senior Site Reliability Engineer Jobs
Fintech • Payments
As a Site Reliability Engineer, you will monitor Azure Cloud systems, automate processes, respond to incidents, and collaborate with development teams to enhance reliability and performance.
Top Skills:
Azure, Bash, Docker, ELK Stack, Go, Grafana, Kubernetes, Prometheus, Python, Splunk, Terraform
Artificial Intelligence • Cloud • Information Technology • Mobile • Software • Consulting
The role involves designing and implementing observability solutions using OpenTelemetry, managing infrastructure through IaC, and establishing SRE practices. Strong expertise in cloud and DevOps engineering is required.
Top Skills:
ArgoCD, AWS, Azure, Bash, CloudFormation, Docker, GCP, GitHub Actions, GitLab CI, Go, Java, Jenkins, Kubernetes, Node.js, OpenTelemetry, PowerShell, Pulumi, Python, Rust, Terraform
Cloud • Software • Analytics
The Principal Cloud Site Reliability Engineer will lead the design and implementation of cloud infrastructure, manage CI/CD pipelines, mentor teams, and ensure secure, performant systems in AWS and Azure environments.
Top Skills:
Ansible, AWS, Azure, Bash, Chef, Docker, ELK, Grafana, Jenkins, Kubernetes, MongoDB, MySQL, Postgres, Prometheus, Puppet, Python, RDS, Salt, Terraform
Fintech
The Site Reliability Engineer will manage AWS infrastructures, oversee application deployments, and ensure system reliability and security while collaborating with teams.
Top Skills:
AWS, Bash, CodeBuild, CodeDeploy, CodePipeline, EC2, IAM, Python, RDS, Route 53, S3, Terraform, VPC
Security • Software • Cybersecurity
The NetOps SRE ensures network infrastructure reliability, handles troubleshooting, configures routing protocols, and collaborates with teams and customers on issues.
Top Skills:
Ansible, Arista, BGP, DDoS, DNS, Git, Juniper, LAN/WAN, MPLS, Python, Unix
Fintech • Financial Services
As a Site Reliability Engineer, you'll ensure the reliability of Remote Access platforms, develop strategies for issue detection, and support IT service management processes.
Top Skills:
Azure, Azure Runbooks, Entra ID, ITIL v4, Linux, PowerShell, Python, SAML, SASE, SCIM, ServiceNow, Splunk
AdTech • Beauty • Marketing Tech • Retail • Pharmaceutical
The IT Operations Manager (SRE) role involves ensuring reliability and scalability of manufacturing digital applications, applying SRE principles, and enhancing operational practices including incident management and observability in a hybrid work environment.
Top Skills:
Automation, Observability, Site Reliability Engineering, SLA, SLI, SLO
Healthtech • Biotech • Pharmaceutical
Design, operate, and automate enterprise Oracle Database platforms across on‑prem, hybrid, and cloud environments. Ensure availability, performance tuning, HA/DR, backup/recovery, security/compliance, monitoring, and lifecycle management while driving IaC automation and SRE practices to reduce toil and improve reliability.
Top Skills:
Active Data Guard, AIOps, Ansible, AWS, Azure, CI/CD, Data Pump, ExaCC, Exadata, Git, Grafana, Infrastructure as Code, Kubernetes, OCI, OCI Resource Manager, OEM, Oracle Cloud Infrastructure, Oracle Data Guard, Oracle Database (19c+), Oracle RAC, Python, RMAN, ServiceNow, Shell Scripting, Splunk, SQL Server, Terraform
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer, you'll ensure site reliability, improve infrastructure automation, manage incidents, and collaborate with engineering teams to enhance systems.
Top Skills:
Docker, Go, Kafka, Kubernetes, Linux, MongoDB, Postgres, Redis, Ruby, Terraform
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer, you will ensure the reliability of internal services, improve automation and infrastructure, and collaborate with engineering teams to resolve issues and enhance product performance.
Top Skills:
Docker, Go, Kafka, Kubernetes, MongoDB, Postgres, Redis, Ruby, Terraform
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer at Braze, you'll ensure uptime for internal services, improve automation, and develop infrastructure tools, collaborating across teams to enhance reliability and scalability.
Top Skills:
Chef, Docker, Kafka, Kubernetes, MongoDB, Redis, Ruby on Rails, Terraform
Fintech • Analytics
As a Senior Site Reliability Engineer, you'll lead incident recovery, enhance production stability, automate processes, and collaborate with development teams to improve operational efficiency.
Top Skills:
AWS, Azure, BigPanda, Cloud-Native Applications, Datadog, DNS, Docker, Git, HTTP, Kubernetes, Shell Scripting, TCP/IP, Unix
Fintech • Financial Services
The role is a technical lead position focused on improving application stability and service levels, using SRE principles in DevOps. Responsibilities include handling incident management, promoting automation, collaborating with stakeholders, and ensuring overall service health for assigned consumer applications.
Top Skills:
Apigee, AWS, CI/CD, Databases (Oracle, Db2), DataPower, GCP, ITIL, Java, Kubernetes, OSE, Service Management, Unix Shell Scripting
Artificial Intelligence • Automotive • Internet of Things • Software
As an SRE intern, you will work on system reliability, design dashboards, collaborate with teams, and apply machine learning techniques.
Top Skills:
AWS, Java, JavaScript, Kubernetes, Python, Terraform
Healthtech • Social Impact • Transportation • Telehealth
The Site Reliability Engineer IV will enhance system reliability and performance, maintain infrastructure, troubleshoot incidents, develop automation tools, and provide on-call support.
Top Skills:
.NET, Azure, CI/CD, Git, IIS, JavaScript, Jenkins, Microsoft Development Stack, Pulumi, Python, Shell, SQL Server, Terraform
Cloud • Software • Analytics
Join Arista Networks as a Site Reliability Engineer to manage CloudVision service reliability, scalability, and stability in a FedRAMP environment, focusing on areas like architecture, security, and performance optimization.
Top Skills:
Ansible, Bash, GCP, GKE, Go, Kubernetes, Pulumi, Python
Marketing Tech • Mobile • Software
Lead the Site Reliability Engineering team, ensuring platform reliability, scalability, and developer support while fostering an inclusive environment and coaching team members.
Top Skills:
Ember, Go, React
Artificial Intelligence • Information Technology
As a Site Reliability Engineer, maintain user-facing services, implement best practices for reliability, and manage production incidents.
Top Skills:
Ansible, Cloud Services, Kubernetes, Programming Languages, Terraform
Software
As a Principal DevOps Engineer, you'll design and implement scalable, secure solutions for hosting mission-critical applications while ensuring business continuity and high availability, leading to continuous delivery in a microservices environment.
Top Skills:
AWS, ECS, Fargate, Linux, Microservices, NoSQL, SQL, Terraform
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
Lead architecture and build reliability platforms, drive AIOps automation, champion SRE practices, lead incident response and postmortems, advance observability, and mentor engineers to improve system reliability and performance.
Top Skills:
AIOps, AWS, Azure, Continuous Profiling, Datadog, DNS, ELK, GCP, Go, Grafana, HTTP/S, Kubernetes, Load Balancing, OpenTelemetry, Prometheus, Python, TCP/IP
Real Estate • Financial Services • PropTech
As a Cloud Infrastructure Intern, you will assist in automating infrastructure, supporting platform creation using Infrastructure as Code, and gaining experience with AWS, Kubernetes, and EC2.
Top Skills:
AWS, Azure, EC2, Kubernetes, Microservices, Python
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.
Top Skills:
Go, Grafana, Kubernetes, Linux, Prometheus, Python, SaltStack, Terraform
Hardware • Information Technology • Other • Software • Analytics
Responsible for developing and maintaining FedRAMP-compliant SaaS and PaaS systems in AWS GovCloud, ensuring reliability, automation, and security of cloud infrastructure.
Top Skills:
Ansible, AWS, Azure, Bash, ECS, EKS, Grafana, Kubernetes, Perl, PowerShell, PRTG, Python, Sumo Logic, Terraform
Fitness • Healthtech • Retail • Pharmaceutical
Lead the reliability and scalability of integration platforms, manage operations teams, define SLOs/SLIs, and improve automation and system resilience.
Top Skills:
ACE, APIC, Apigee, APIM, DataPower, JWT, Kong, Kubernetes, MQ, OAuth 2.0, Splunk
Analytics
The Site Reliability Engineer will ensure the reliability and performance of IaaS services, perform incident resolution, and enhance system reliability through automation while supporting mobility across hybrid infrastructures and collaborating extensively with various teams.
Top Skills:
Ansible, AWS, Azure, Bash, GitLab CI, Jenkins, Kubernetes, Linux, OpenShift, Python, Terraform, VMware vSphere