Top Site Reliability Engineer Jobs

Reposted 5 days ago
In-Office
Austin, TX, USA
125K-181K Annually
Senior level
Fintech • Information Technology • Payments
The Staff Site Reliability Engineer will be responsible for DevOps on payment systems, production development, debugging code, data pipeline monitoring, and ensuring data integrity while guiding junior team members.
Top Skills: Amazon Redshift, AWS, Azure, Cassandra, GCP, Google BigQuery, Hadoop, Java, Kafka, MongoDB, MySQL, Postgres, Python, Scala, Snowflake, Spark, SQL

Reposted 5 days ago
In-Office
18 Locations
Senior level
Artificial Intelligence • Information Technology • Software
The Senior Site Reliability Engineer at BentoML will manage infrastructure for AI services, focusing on Kubernetes, Terraform, GPU clusters, and observability tools, while mentoring and driving SRE best practices.
Top Skills: AMD GPU, AWS, Azure, CI/CD, GitOps, GCP, Grafana, Kubernetes, NVIDIA GPU, Oracle Cloud, Prometheus, Pulumi, Terraform

Reposted 5 days ago
In-Office
Austin, TX, USA
175K-240K Annually
Senior level
Financial Services
The Staff Engineer will support and optimize messaging platforms, design solutions to improve operational efficiency, and collaborate with teams on business-focused solutions.
Top Skills: AMPS, AWS, EKS, FIX, Java, Kafka, Kubernetes, Linux, MQ, Spring, SQL

Reposted 5 days ago
Remote
USA
160K-180K Annually
Senior level
Software • Database
The Senior Site Reliability Engineer will manage AWS infrastructure, improve CI/CD pipelines, and help teams scale their solutions. Responsibilities include overseeing logging and monitoring and supporting high-quality software development with strong attention to security and reliability.
Top Skills: Ansible, AWS, Chef, CloudFormation, Datadog, Docker, Dynamo, Elasticsearch, GitHub Actions, MySQL, OpenSearch, Postgres, Puppet, Python, Redis, S3, Terraform

Reposted 5 days ago
In-Office
San Francisco, CA, USA
238K-290K Annually
Expert/Leader
Artificial Intelligence • Legal Tech • Professional Services • Software
As a Staff Software Engineer in Site Reliability, you'll manage infrastructure for reliability and scalability, lead incident management, and automate operational tasks.
Top Skills: AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, Incidentio, PagerDuty, Pulumi, Python, Sentry, Terraform

Reposted 5 days ago
In-Office
San Francisco, CA, USA
200K-260K Annually
Mid level
Artificial Intelligence • Legal Tech • Professional Services • Software
As a Software Engineer in Site Reliability, you will ensure the reliability and performance of our AI platform through automation and strategic infrastructure management.
Top Skills: AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, Kubernetes, PagerDuty, Python, Sentry, Terraform

Reposted 5 days ago
In-Office
Denver, CO, USA
130K-170K Annually
Mid level
Artificial Intelligence • Cloud • Information Technology • Mobile • Software • Consulting
The role involves designing and implementing observability solutions using OpenTelemetry, managing platform engineering tasks, and ensuring site reliability through various engineering practices.
Top Skills: AWS, Azure, CI/CD, CloudFormation, Docker, GCP, Go, Java, Kubernetes, Node.js, OpenTelemetry, Pulumi, Python, Rust, Terraform

Reposted 5 days ago
Hybrid
Seattle, WA, USA
192K-264K Annually
Senior level
Software • Cybersecurity
As a Staff SRE/DevOps Engineer, you'll lead cloud rearchitecture initiatives, drive modernization focusing on availability, and mentor teams on DevOps practices.
Top Skills: Argo CD, AWS, Bash, CloudFormation, Docker, ELK, GCP, GitHub Actions, Go, Grafana, Kubernetes, Prometheus, Python, Terraform

Reposted 5 days ago
In-Office
San Francisco, CA, USA
130K-175K Annually
Junior
Energy
The Site Reliability Engineer will design and implement systems, drive automation, coordinate between teams, support deployed systems, and ensure scalability for rapid growth.
Top Skills: Active Directory, Ansible, AWS, Azure, Chef, JSON, Linux, Puppet, Python, REST, VMware, Windows Server, YAML

6 days ago
Remote
USA
100K-720K Annually
Senior level
News + Entertainment
Design and maintain scalable infrastructure, collaborate with teams for reliability, handle incident response, and promote reliability culture.
Top Skills: AWS, Azure, GCP, Go, Java, Kubernetes, Python, Terraform

6 days ago
In-Office
Miami, FL, USA
70K-130K Annually
Mid level
Computer Vision • Information Technology • Software
The Site Reliability Engineer will enhance system stability, optimize performance, automate deployments, and monitor production systems in both on-premise and cloud environments.
Top Skills: Ansible, Azure, Azure Application Insights, Bash, Bicep, ELK Stack, Grafana, PowerShell, Python, Terraform

6 days ago
Remote
USA
Senior level
Blockchain • Fintech • Financial Services • Cryptocurrency
The VP, Site Reliability Engineer will architect and maintain AWS infrastructure, optimize container workloads, and drive automation and reliability initiatives. Responsibilities include migration from VMs to containers, incident response, and cross-team collaboration.
Top Skills: AWS, Datadog, EKS, Kubernetes, OpenTelemetry, Terraform

6 days ago
In-Office
Austin, TX, USA
125K-181K Annually
Senior level
Fintech • Information Technology • Payments
As a Staff Site Reliability Engineer, you'll maintain and support Hadoop, Kafka, and Cloud platforms, ensuring their performance and reliability while driving innovation globally. You'll manage clusters, develop monitoring tools, collaborate on solutions, analyze production incidents, and create procedural documentation.
Top Skills: Ansible, AWS, Azure, GCP, Grafana, Hadoop, Java, Kafka, Linux, Python, Spark, Splunk

6 days ago
In-Office
Seattle, WA, USA
126K-170K Annually
Mid level
Real Estate • PropTech
The role involves enhancing Redfin's reliability through better tools and processes, guiding teams in effective production system operations, and leading educational efforts in reliability engineering.
Top Skills: AWS, C++, Datadog, Java, Kubernetes, Python, Terraform

Reposted 6 days ago
In-Office
Santa Clara, CA, USA
168K-265K Annually
Expert/Leader
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Network Site Reliability Engineer will ensure high availability of network infrastructure, work on incident management, implement automation, and drive improvements for operational excellence.
Top Skills: Alert Manager, Ansible, BGP, BigPanda, Data Center Network Technologies, Firewalls, Go, Grafana, IPv4, IPv6, IS-IS, ITIL, JIRA, L2 Switching, Linux, Load Balancers, Nautobot, NetBox, Prometheus, Python, Salt, ServiceNow, TCP, UDP, VPN, Wireless

6 days ago
Remote
United States
Senior level
Cloud
The Software Engineer will enhance, optimize, and validate the MinIO cloud-native storage platform while collaborating with customers and the engineering team.
Top Skills: C, C++, Containers, Go, Kubernetes, Microservices, Rust

Reposted 6 days ago
In-Office
Jacksonville, FL, USA
Senior level
eCommerce • Fintech • Information Technology • Payments • Software
The Site Reliability Engineer Specialist will enhance software reliability, manage infrastructure processes, and mentor junior engineers while ensuring system performance and compliance.
Top Skills: Akamai Global Traffic Management, Amazon AWS, Harness, Jenkins, Azure, REST, Site Reliability Engineering, XML

Reposted 6 days ago
Remote
United States
Mid level
Fitness
The Site Reliability Engineer will ensure system reliability and performance, design scalable architectures, improve CI/CD pipelines, maintain infrastructure, and lead incident response efforts.
Top Skills: Argo CD, AWS, Datadog, Docker, GitHub Actions, Go, JavaScript, Kubernetes, Prometheus, Python, Terraform

Reposted 6 days ago
In-Office
New York, NY, USA
Senior level
Artificial Intelligence
As an Applied AI Engineer, you will onboard customers, deploy AI solutions, work on complex projects, and provide technical guidance. You'll contribute to open-source projects and communicate effectively with stakeholders.
Top Skills: Ansible, AWS, Azure, Docker, GCP, Kubernetes, Python, Terraform

7 days ago
In-Office or Remote
5 Locations
120K-236K Annually
Junior
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
As a Site Reliability Engineer, you'll troubleshoot HPC environments, enhance automation, ensure system reliability, and collaborate to improve chip development processes.
Top Skills: CentOS, RHEL, Docker, Python, Bash, Ansible

7 days ago
In-Office
Austin, TX, USA
Senior level
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills: AWS EKS, Datadog, GitHub Actions, Go, Istio, k6, Kubernetes, Node.js, Terraform

7 days ago
Remote
United States
200K-250K Annually
Expert/Leader
Fintech
As a Staff Site Reliability Engineer, you will shape reliability practices, optimize AWS infrastructure, lead incident response, and mentor engineers.
Top Skills: AWS, Datadog, GitOps, Terraform

7 days ago
In-Office
New York, NY, USA
110K-130K Annually
Senior level
Financial Services
As a Site Reliability Engineer, you'll optimize and manage cloud infrastructure, implement automation, and maintain system reliability for a global financial platform.
Top Skills: AWS, GCP, Go, Helm, Kubernetes, Linux, Python, Terraform

7 days ago
In-Office or Remote
3 Locations
Mid level
Healthtech • Software
Monitor application health, respond to incidents, implement Infrastructure as Code, and collaborate with teams to maintain service reliability and performance.
Top Skills: AWS, Docker, Ember, Git, MySQL, NestJS, Node.js, React

7 days ago
In-Office
2 Locations
Mid level
Fintech • Software • Financial Services
As a Site Reliability Engineer at Luma, you'll manage AWS infrastructure, Kubernetes clusters, and CI/CD pipelines, ensuring platform reliability and security. You'll also automate processes and lead incident response efforts.
Top Skills: AWS, Bash, CI/CD, Go, Java, Kubernetes, Python, Terraform