Get the job you really want.
Top Senior Site Reliability Engineer Jobs
Reposted 4 Days Ago
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The role involves supporting network infrastructure, automating cloud infrastructure, managing CI/CD workflows, and ensuring operational excellence in IT support, including incident response and security practices.
Top Skills:
Ansible, AWS, Bash, Docker, Git, Kubernetes, Python, Ruby, Terraform
Healthtech • Biotech • Pharmaceutical
Design, operate, and automate enterprise Oracle Database platforms across on‑prem, hybrid, and cloud environments. Ensure availability, performance tuning, HA/DR, backup/recovery, security/compliance, monitoring, and lifecycle management while driving IaC automation and SRE practices to reduce toil and improve reliability.
Top Skills:
Active Data Guard, AIOps, Ansible, AWS, Azure, CI/CD, Data Pump, ExaCC, Exadata, Git, Grafana, Infrastructure As Code, Kubernetes, OCI, OCI Resource Manager, OEM, Oracle Cloud Infrastructure, Oracle Data Guard, Oracle Database (19c+), Oracle RAC, Python, RMAN, ServiceNow, Shell Scripting, Splunk, SQL Server, Terraform
Fintech
The role involves building full-stack applications that enhance operational efficiency for SRE teams, integrating tools for automation, reliability, and scalability, along with developing backend services and user interfaces with strong collaboration with SREs.
Top Skills:
Ansible, AWS, CloudFormation, Datadog, Go, Grafana, Kafka, Kubernetes, Node.js, OpenTelemetry, Prometheus, Python, React, React Native, Terraform
Fintech
The Site Reliability Engineer will manage AWS infrastructures, oversee application deployments, and ensure system reliability and security while collaborating with teams.
Top Skills:
AWS, Bash, CodeBuild, CodeDeploy, CodePipeline, EC2, IAM, Python, RDS, Route 53, S3, Terraform, VPC
Cloud • Fintech • HR Tech
Support U.S. federal government contracts by managing operations of services. Collaborate with development teams to enhance architecture and ensure service reliability.
Top Skills:
Cloud Infrastructure, Distributed Systems, IaC Tools, Observability, Programming Languages
Information Technology
As a Site Reliability Engineer, you will enhance system resilience and efficiency, automate tasks, and support software applications for the Intelligence Community.
Top Skills:
Agile, ArgoCD, Bitbucket, CI/CD, Elasticsearch, GitLab, Java Spring Boot, Kafka, Kubernetes, MongoDB, NiFi
Gaming
Manage operational tasks for gaming services, design runtime environments, monitor metrics, optimize architecture, and research software solutions.
Top Skills:
C/C++, Go, Istio, Java, K8s, Linux, MySQL, Nginx, Python, Rust, Shell
Security • Software
The role involves architecting and leading deployment automation, managing SaaS reliability, guiding teams on cloud tools, and responding to incidents.
Top Skills:
Ansible, AWS, CloudFormation, CloudWatch, Datadog, Grafana, Helm, Kubernetes, OpenSearch, PagerDuty, Python, Salt, Terraform
Energy
As a Site Reliability Engineer, you will design resilient systems, manage incident responses, and implement automation for infrastructure management, ensuring high operational standards.
Top Skills:
AWS, Azure, Bash, GCP, Go, Kubernetes, Python, Terraform
Fintech • Information Technology • Payments
The Staff Site Reliability Engineer will develop on the ServiceNow platform CMDB, manage data, support operations, and enhance system functionality.
Top Skills:
CMDB, Data Governance, ETL, ITOM Discovery, Microsoft SSIS, ODBC, REST APIs, ServiceNow, SQL
Artificial Intelligence • Big Data • Information Technology • Security • Software
The Site Reliability Engineer ensures operational excellence in a telecommunication solution on the public cloud, handling automation, incident management, performance planning, and security collaboration.
Top Skills:
Ansible, AWS, Datadog, Docker, GCP, GitLab, Helm, Java, Jenkins, Kubernetes, NoSQL, Terraform
Big Data • Cloud • Information Technology
The Site Reliability Engineer at Iron Mountain will troubleshoot escalated tickets, manage Windows Server builds, perform security patching, and collaborate with customers and vendors to resolve issues and maintain systems.
Top Skills:
Cloud, Compute, Hyper-Converged Infrastructure, Linux, Microsoft Endpoint Configuration Manager, Network, Nutanix, PowerShell, Rubrik, Storage, Virtualization, Windows Server
Software
As a Site Reliability Engineer, you will enhance system reliability, manage cloud services, respond to incidents, and support network systems.
Top Skills:
Automation, Cisco Routing, Cloud Services, F5 Load Balancing, Fortinet Firewalls, Infrastructure Automation, Monitoring, Networking
Aerospace
The Site Reliability Engineer ensures system reliability, partners with development teams to improve operational quality, and leads incident resolution. Responsibilities include incident response, toil reduction, reliability evaluations, and platform enablement.
Top Skills:
ArgoCD, Docker, Git, GitLab, GitOps, Go, Grafana, Jenkins, Kubernetes, Prometheus, Python
Reposted Yesterday
Fintech • Financial Services
As a Site Reliability Engineer, you'll ensure the reliability of Remote Access platforms, develop strategies for issue detection, and support IT service management processes.
Top Skills:
Azure, Azure Runbooks, Entra ID, ITIL v4, Linux, PowerShell, Python, SAML, SASE, SCIM, ServiceNow, Splunk
AdTech • eCommerce • Food • Marketing Tech • Retail
The Senior Site Reliability Engineer is responsible for ensuring production system reliability, scalability, and performance through automation, monitoring, and infrastructure engineering. The role includes mentoring junior engineers and managing production environments, while collaborating with engineering teams to improve system resilience.
Top Skills:
AKS, ArgoCD, AWS, Azure, Bash, Datadog, Docker, ELK, GCP, GitHub Actions, Go, Java, Kafka, Kubernetes, Prometheus, Python, Redis, Spring Boot, Terraform, Tomcat
AdTech • eCommerce • Food • Marketing Tech • Retail
The Senior Site Reliability Engineer is responsible for ensuring production systems' reliability, scalability, and performance through automation, observability, and infrastructure engineering.
Top Skills:
AKS, ArgoCD, Bash, Datadog, Docker, ELK, GitHub Actions, Go, Java, Kafka, Kubernetes, Prometheus, Python, Redis, Spring Boot, Terraform, Tomcat
AdTech • eCommerce • Food • Marketing Tech • Retail
Responsible for maintaining and improving the reliability of production systems through automation, monitoring, and incident response in a cloud-native environment, while mentoring junior engineers.
Top Skills:
AKS, ArgoCD, Bash, Datadog, Docker, ELK, GitHub Actions, Go, Java, Kafka, Kubernetes, Prometheus, Python, Redis, Terraform, Tomcat
Security
The Director of DevSecOps and SRE will lead teams in SRE, Cloud Infrastructure, and DevOps practices, focusing on automation, infrastructure reliability, and security policies while mentoring engineers and managing software projects.
Top Skills:
AWS Cloud Technologies, GitLab, Grafana, Java, Kubernetes, Loki, Material UI, Postgres, Prometheus, RabbitMQ, React, Redux, Sentry, Spring, Tailwind, Terraform
Reposted Yesterday
eCommerce • Fintech • Payments
The role focuses on ensuring system reliability, performance, and efficiency by applying software engineering practices to operations tasks, including monitoring and troubleshooting systems.
Top Skills:
Ansible, Datadog, GCP, Jenkins, Kubernetes, Logstash, Looker Studio, Splunk, Terraform, ThousandEyes
Fintech • Financial Services
The role involves creating a digital strategy, enhancing CI/CD pipelines, managing cloud infrastructure, ensuring platform reliability, and leading DevOps practices. Requires collaboration, mentorship, and adherence to compliance in technology services.
Top Skills:
App Insights, ARM, AWS, Azure, Azure DevOps, Bash, CI/CD, CloudFormation, Docker, Dynatrace, ELK, GCP, GitHub Actions, Grafana, Jenkins, Kubernetes, New Relic, Oracle, Prometheus, Python, Splunk, SQL, Terraform
Artificial Intelligence • Robotics • Automation • Manufacturing
Responsible for managing and setting up internal systems infrastructure, migrating SaaS to self-hosted solutions, implementing monitoring systems, and ensuring security compliance.
Top Skills:
Ansible, AWS, Azure, CloudFormation, Datadog, DNS, GCP, Grafana, HTTP, Linux/Unix, Prometheus, TCP/IP, Terraform
Cloud • Information Technology • Security • Software
Design, build, and operate network infrastructure for cloud and on-prem environments, ensuring reliability, scalability, and security through automation and observability.
Top Skills:
Ansible, AWS VPC, Azure VNets, BGP, DNS, ELK, Envoy, Firewalls, GCP VPC, Go, Grafana, Nginx, OpenTelemetry, OSPF, Prometheus, Python, TCP/IP, Terraform, Transit Gateway, VLANs
Cloud • Information Technology • Security • Software
The role involves designing, building, and operating infrastructure systems, focusing on automation, reliability, and security for cloud and on-prem environments while collaborating closely with engineering teams.
Top Skills:
Ansible, Bash, CI/CD, CloudFormation, Docker, ELK, Go, Grafana, Kubernetes, Linux, OpenTelemetry, Prometheus, Python, Terraform
Blockchain
The Blockchain Site Reliability Engineer is responsible for maintaining blockchain nodes' reliability, monitoring, incident response, and building automation tools to enhance operations.
Top Skills:
Docker, ELK, Go, Grafana, JavaScript, Kubernetes, Linux, Prometheus, Python, Rust, Shell