Get the job you really want.
Top Senior Site Reliability Engineer Jobs
Healthtech • Insurance
The Senior Software Engineer will lead technical projects in cloud infrastructure, mentoring teams, improving DevOps practices, and ensuring system resilience.
Top Skills:
AWS, CI/CD, GCP, GitHub Actions, Grafana, IAM, Istio, Kubernetes, Prometheus, Terraform, VPC Peering
Healthtech • Insurance
The Senior Software Engineer will lead complex projects, mentor engineers, and ensure cloud infrastructure is resilient and automated. Responsibilities include developing software, managing production environments, and enforcing coding standards.
Top Skills:
Argo CD, AWS, GCP, GitHub Actions, Grafana, Istio, Kubernetes, Prometheus, Terraform
Healthtech • Insurance
The Senior Software Engineer will lead cloud infrastructure projects, mentor junior engineers, ensure system reliability, and drive technical roadmaps.
Top Skills:
AWS, CI/CD, GCP, GitHub Actions, Grafana, Istio, Kubernetes, Prometheus, Terraform
Healthtech • Insurance
The Senior Software Engineer will lead technical projects, mentor engineers, and build resilient cloud infrastructures focusing on SRE best practices.
Top Skills:
AWS, CI/CD, GCP, GitHub Actions, Grafana, Kubernetes, Prometheus, Terraform
Healthtech • Professional Services • Software
Lead design, architecture, and reliability of scalable systems; own incident response, monitoring, and CI/CD automation. Mentor engineers, drive tooling and AI adoption, and collaborate across teams to meet business needs and maintain high system availability.
Top Skills:
Argo CD, Azure DevOps Pipelines, CI/CD, ELK Stack, GCP, Grafana, Istio, Kubernetes, New Relic, OpenTelemetry, Terraform
Automotive • Hardware • Logistics
The Site Reliability Engineer III enhances system reliability by building automation and supporting large-scale systems, ensuring critical platforms function optimally.
Top Skills:
APIs, Azure DevOps, Dynatrace, Google Cloud Platform, Grafana, HTTP, Java, Kubernetes, Microservices, Prometheus, Terraform
Artificial Intelligence • Software
As a Software Engineer on the Site Reliability team, you'll ensure system reliability, scalability, and observability while partnering with engineering teams and improving incident management processes.
Top Skills:
AWS, CI/CD Tooling, Container Orchestration, Datadog, Grafana, Prometheus, Terraform
Insurance • Cybersecurity
This role involves leading AI enablement, developing tools for AI-assisted development, and ensuring reliable, secure production environments. Responsibilities include integrating AI tools into workflows and mentoring engineering teams.
Top Skills:
AI-Assisted Development Tools, AWS, CI/CD Tools, Cursor, Datadog, ECS, GitHub Actions, GitHub Copilot, Go, Kubernetes, Python, Terraform
Artificial Intelligence
The Site Reliability Engineer II will enhance infrastructure and software reliability, write efficient code, collaborate across teams, and maintain platforms and monitoring tools.
Top Skills:
AWS, CI/CD, Coralogix, Docker, JavaScript, Kubernetes, Python, Sentry, Terraform, Unix Shell
Artificial Intelligence
In this role, the Site Reliability Engineer will improve reliability and performance of infrastructure, write clean code, collaborate across teams, and maintain platforms for deployed software.
Top Skills:
AWS, CI/CD, Docker, JavaScript, Kubernetes, Python, Terraform, Unix Shell
Database
The Site Reliability Engineer will oversee the Digital Realty interconnection fabric network infrastructure, focusing on network operations, automation, and development. Responsibilities include maintaining global network infrastructure, responding to alerts, and working with various cloud platforms and automation tools.
Top Skills:
Ansible, AWS, Azure, Git, GCP, IBM Cloud, Jenkins, Linux, Oracle Cloud, Python, Terraform
Cloud • Information Technology • Security • Software
The Senior Manager will lead the SRE and DevOps teams, manage software engineering, collaborate on cloud infrastructure, and drive innovation, ensuring resilience and quality.
Top Skills:
Amazon Web Services, APM, C, CI/CD, CloudFormation, EC2, ELB, Git, Go, Grafana, Java, Jenkins, Kubernetes, PagerDuty, Python, S3, Spinnaker, Splunk, VPC
Other
In this role, you will manage day-to-day operations of Internet-based enterprise systems, identify operational issues, develop tools for maintenance, and collaborate on infrastructure documentation and project execution.
Top Skills:
.NET, Ansible, Apache, Azure, Chef, IIS, JBoss, Perl, PowerShell, Puppet, Python, Ruby, Tomcat
Other
Responsible for monitoring, provisioning, and customer interactions, with a focus on maintaining high availability in complex web environments.
Top Skills:
.NET, Ansible, Apache, CFEngine, Chef, Dynatrace, Go, IIS, Java, JBoss, NAS, New Relic, Perl, PowerShell, Puppet, Python, RAID, Ruby, SAN, Splunk, Sumo Logic, Tomcat, Windows
Cloud • Healthtech • Internet of Things • Machine Learning • Software
Lead the design and implementation of scalable and fault-tolerant infrastructure on AWS and Kubernetes, mentor engineers, and drive operational excellence.
Top Skills:
AWS, Go, Grafana, Java, Kubernetes, OpenTelemetry, Prometheus, Python, Terraform
Aerospace • Other
Design, operate, and scale infrastructure for Starlink, developing automation and collaborating with software engineers to enhance product operability and performance.
Top Skills:
Ansible, Bash, C, C++, Docker, Go, Kubernetes, Linux, Python, Terraform
Aerospace • Other
Design, operate, and scale infrastructure for Starlink's software and network, focusing on Kubernetes and high availability systems.
Top Skills:
Ansible, Bash, C++, Go, Kubernetes, Python, Terraform
Healthtech • Insurance
The Senior Software Engineer, Cloud Infrastructure is responsible for architecting resilient systems on AWS/GCP, leading projects, mentoring engineers, improving software quality, and ensuring compliance with laws and regulations.
Top Skills:
AWS, CI/CD, GCP, GitHub Actions, Grafana, Istio, Kubernetes, Prometheus, Terraform
Aerospace • Big Data • Greentech • Hardware • Social Impact
The Site Reliability Engineer will build, deploy, and operate computing services for satellite imaging, ensuring reliable and scalable infrastructure while collaborating with cross-functional teams.
Top Skills:
Alloy, Ansible, Bash, Cloud-Native Infrastructure, Grafana, Helm, K3s, Kubernetes, Kustomize, OpenTelemetry, Prometheus, Proxmox, Python, RKE2, Talos, Terraform
Artificial Intelligence • Cloud • Machine Learning • Software • Database • App development • Generative AI
As a Site Reliability Engineer at Replit, you'll enhance system reliability through observability, automation, incident management, and performance optimization, serving millions globally.
Top Skills:
Ansible, Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Pulumi, Python, Terraform
Artificial Intelligence • Cloud • Machine Learning • Software • Database • App development • Generative AI
As a Staff Site Reliability Engineer at Replit, you will ensure infrastructure reliability, drive automation, lead incident management, and mentor the engineering team while enhancing system performance and observability.
Top Skills:
Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, OpenTelemetry, Prometheus, Python, Terraform
Healthtech • Database
Seeking a Principal Site Reliability Engineer to build an SRE practice, enhance reliability, mentor teams, and drive performance engineering to optimize Quest products and services.
Top Skills:
Ansible, Aurora, AWS, Azure, Bigtable, Cassandra, CI/CD, Cloud Pub/Sub, Cloud Spanner, Cloud SQL, Docker, DynamoDB, Dynatrace, GitLab, Go, GCP, Java, JMS, Kafka, Kinesis, Kubernetes, MQ, Perl, Python, RDS, Ruby, Shell Scripting, Terraform
Financial Services
As a Principal Site Reliability Engineer, you'll lead a team focusing on observability and automating solutions for cloud and on-prem infrastructures, enhancing reliability and incident response across T. Rowe Price's tech ecosystem.
Top Skills:
.NET Core, Amazon AWS, Ansible, Elastic Stack, Go, Grafana, Java, MySQL, New Relic, Node.js, Postgres, Prometheus, Python, SolarWinds DPA, Splunk, SQL Server, Terraform, Vagrant, Vault
Artificial Intelligence • Cloud • Social Impact • Software • Wearables
As a Site Reliability Engineer II, you will develop automation workflows and services, manage cloud operations, participate in incident response, and influence architectural patterns for improved efficiency.
Top Skills:
AWS, AWS CloudFormation, Azure, C#, CI/CD, Go, Java, Kubernetes, Python, Temporal, Terraform
Fintech • Information Technology • Payments
The Staff Site Reliability Engineer designs and builds cloud-native infrastructure on Azure for data services, ensuring reliability, security, and scalability.
Top Skills:
Automation, Azure Kubernetes Service, Configuration Management, Container Orchestration, Infrastructure as Code, Azure