Get the job you really want

Top Site Reliability Engineer Jobs

295+ Job Results
5 Days Ago
Austin, TX, USA
Remote
440 Employees
Expert/Leader
Cannabis • Consumer Web • eCommerce • Software
The Staff Site Reliability Engineer at Weedmaps will design and develop resilient CI/CD and Kubernetes infrastructure, collaborate with engineering teams, mentor other engineers, and influence technical direction across multiple technology stacks. They will be responsible for innovative solutions to improve service reliability and performance while managing critical company initiatives.
Top Benefits:
401-K
401-K Matching
Commuter Benefits
+27 More
5 Days Ago
Framingham, MA, USA
Remote
5,500 Employees
121K-131K Annually
Entry level
Artificial Intelligence • Gaming • Machine Learning • Software • Virtual Reality • Design • Metaverse
The Site Reliability Engineer will monitor and support complex hosting environments for in-game voice communication, evaluate system performance, and utilize automation tools to manage servers. Responsibilities include project work to develop tools, handling on-site tasks, and participating in on-call rotations to maintain high service levels.
Top Benefits:
401-K
401-K Matching
Adoption Assistance
+54 More
5 Days Ago
USA
Remote
67 Employees
Mid level
Information Technology
The SRE Engineer will enhance system reliability and performance through monitoring, automation, incident response, and disaster recovery planning. Key responsibilities include developing monitoring tools, troubleshooting incidents, optimizing performance, and collaborating with development teams to improve system design and deployment.
5 Days Ago
Madisonville, KY, USA
20,258 Employees
Mid level
Fintech
The Site Reliability Engineer will design and implement solutions for enhancing reliability, improve the bank's key products and services, and provide operational support. The role involves optimizing processes, championing continuous improvement, and acting as a technical leader for Agile teams while coaching other engineering members.
Top Benefits:
401-K
Adoption Assistance
Company Equity
+15 More
6 Days Ago
Naperville, IL, USA
Remote
Hybrid
240 Employees
Mid level
Artificial Intelligence • Big Data • Cloud • Information Technology • Machine Learning
Seeking a Site Reliability Engineer to ensure system reliability and infrastructure support, delivering scalability, performance optimization, incident management, and analysis.
6 Days Ago
United States of America
73,000 Employees
Entry level
Other • Retail
As a Site Reliability Engineer I, you will ensure the availability and performance of Nike's digital experiences by analyzing problems, identifying defects, and collaborating on solutions. Key tasks include implementing monitoring solutions, managing IT service processes, and enhancing application reliability on web and mobile platforms.
6 Days Ago
United States
Remote
8 Employees
Senior level
Artificial Intelligence • Marketing Tech • Sales • Software
As a Site Reliability Engineer, you will analyze and enhance system performance, monitor ClickHouse clusters, optimize backup and recovery processes, ensure security, and collaborate with the founders to build AI-driven solutions.
7 Days Ago
West Chester, PA, USA
68,848 Employees
68K-101K Annually
Entry level
Digital Media • Gaming • Internet of Things • News + Entertainment • Retail • Business Intelligence • Cybersecurity
The SRE/SecOps Engineer will be responsible for ensuring the reliability, security, and operation of the Service Fulfillment and Assurance platform. This includes reviewing and writing code, monitoring systems, and collaborating with quality assurance to meet technical requirements.

Featured Jobs

13 Days Ago
United States
Remote
2,194 Employees
Senior level
AdTech • Artificial Intelligence • Marketing Tech • Software • Analytics
The Senior SRE Engineer will implement and manage SLOs, SLIs, and error budgets, lead postmortems and root cause analysis, enhance system reliability, and drive automation and observability using modern tools. The role involves collaborating with product teams and engineering on strategic initiatives for capacity and reliability.
Top Benefits:
401-K
401-K Matching
Adoption Assistance
+64 More
13 Days Ago
USA
Remote
200 Employees
Senior level
Big Data • Healthtech • HR Tech • Machine Learning • Software • Telehealth • Big Data Analytics
The Senior Platform Engineer will architect, operate, and enhance the platform for the Garner Health app, ensuring high performance and security compliance. Responsibilities include boosting developer productivity, collaborating with teammates on strategic initiatives, and supporting the platform in production with a focus on cloud-first projects.
Top Benefits:
401-K
Commuter Benefits
Company Equity
+24 More
7 Days Ago
United States
Remote
1,050 Employees
204K-281K Annually
Expert/Leader
Information Technology • Security • Cybersecurity
As a Principal Site Reliability Engineer, you will lead the implementation of advanced observability and automated systems within a microservices-based SaaS environment. You'll collaborate with teams to define and monitor SLOs, establish reliability standards, and mentor engineers to drive reliability engineering excellence.
Top Benefits:
401-K
Commuter Benefits
Company Equity
+46 More
7 Days Ago
2 Locations
Remote
200 Employees
Senior level
Software
As a Senior Platform Engineer at Mux, you will design and operate the infrastructure for Mux's platforms, focusing on scalable systems and CI/CD processes. You'll improve platform usability via automation, lead cross-functional projects, debug production issues, and promote engineering standards and best practices.
Top Benefits:
401-K
Commuter Benefits
Company Outings
+24 More
8 Days Ago
2 Locations
68,848 Employees
113K-185K Annually
Mid level
Digital Media • Gaming • Internet of Things • News + Entertainment • Retail • Business Intelligence • Cybersecurity
The Cloud Site Reliability Engineer will design and build scalable infrastructure, maintain cloud governance for AWS, write automation scripts, support incident resolution, and collaborate with development teams to optimize system performance. They will also document processes and help implement security measures.
8 Days Ago
TX, USA
Remote
1,637 Employees
Senior level
Cloud • Information Technology • Other • Software
The Site Reliability Engineer III is responsible for designing, developing, and optimizing systems for reliability and performance. This role involves implementing tools to measure system health, guiding engineering teams in observability practices, and improving operational processes. The engineer will proactively address production issues and provide technical leadership, mentoring other staff as needed.
Top Benefits:
401-K
401-K Matching
Adoption Assistance
+59 More
8 Days Ago
8 Locations
Remote
4,900 Employees
66K-87K Annually
Entry level
Fintech • Payments
As an entry-level Site Reliability Engineer, you will learn SRE principles, assist in system design and incident response, and support automation and tooling with a focus on improving system reliability. You will collaborate with development teams, engage in capacity planning, and understand security compliance while gaining practical experience in a technical role.
Top Benefits:
401-K
Adoption Assistance
Company Equity
+18 More
8 Days Ago
8 Locations
Remote
4,900 Employees
156K-208K Annually
Senior level
Fintech • Payments
The Senior Staff Site Reliability Engineer will lead and mentor SRE teams, design and implement scalable systems, optimize performance, manage incident responses, and ensure compliance and security within the organization. They will also focus on automation tools and collaborate closely with software development teams.
Top Benefits:
401-K
Adoption Assistance
Company Equity
+18 More
8 Days Ago
United States
Remote
259 Employees
165K-210K Annually
Senior level
Cloud • Payments
As a Staff Site Reliability Engineer at VGS, you will architect and maintain scalable cloud infrastructure, lead incident management, optimize performance, and collaborate with cross-functional teams to enhance system reliability. You will also advocate for best practices and mentor junior engineers while driving continuous improvement efforts.
8 Days Ago
Portland, OR, USA
35 Employees
Mid level
Software
As a Site Reliability Engineer, you will manage and enhance AWS infrastructure, optimize Kubernetes clusters, develop Infrastructure as Code with Terraform, improve CI/CD pipelines, and ensure system security and performance monitoring. You will collaborate with teams to resolve issues and improve application reliability.
8 Days Ago
Dallas, TX, USA
2,760 Employees
Junior
Cloud • Information Technology
The Site Reliability Engineer ensures high availability and performance of OVHcloud products, manages infrastructure, diagnoses errors, automates tasks with scripting, and participates in software development and monitoring. They collaborate in on-call rotations and provide support for newly developed products and services.
8 Days Ago
United States
Remote
Mid level
Software
As a Site Reliability Engineer at Vercel, you'll enhance Edge infrastructure, manage incident responses, and integrate SRE practices into engineering processes. You'll focus on improving reliability, performance, and efficiency while developing automated systems for software delivery and capacity management.
8 Days Ago
United States
Remote
410 Employees
Mid level
Software
As a Site Reliability Engineer, you will manage production infrastructure on AWS and Azure, ensuring high availability and performance. You'll automate alerts, collaborate with R&D for scalable solutions, and document processes for repeatability. Your responsibilities include troubleshooting incidents, monitoring system observability, and conducting on-call duties.
8 Days Ago
United States
Remote
197 Employees
Senior level
Healthtech
As a Senior Site Reliability Engineer at Sword Health, you will maintain service health, develop automation tools, optimize system performance, ensure security compliance, manage databases, and share knowledge within the team.
8 Days Ago
Rocklin, CA, USA
2,000 Employees
Entry level
Information Technology • Machine Learning • Software • Analytics • Business Intelligence • App development • Generative AI
As a Site Reliability Engineer at Nisum, you will provide Level 2/3 support for eCommerce applications, analyze root causes of production issues, collaborate with teams to ensure application stability, monitor application performance, document support activities, and participate in on-call support.
8 Days Ago
Austin, TX, USA
1,500 Employees
120K-297K Annually
Junior
Social Media • Software
As a Site Reliability Engineer, you will be responsible for maintaining and enhancing the performance and reliability of large-scale HPC and AI/ML systems, managing clusters, automating deployments, troubleshooting issues, and collaborating with cross-functional teams to support infrastructure.
Top Benefits:
401-K
401-K Matching
Child Care Benefits
+37 More
8 Days Ago
Denver, CO, USA
7,000 Employees
107K-153K Annually
Senior level
Artificial Intelligence • Cloud • Events • Productivity • Software • Business Intelligence • Conversational AI
The Site Reliability Engineer (SRE) at RingCentral is responsible for maintaining and improving service reliability and availability. Duties include integrating monitoring solutions, implementing failover mechanisms, conducting risk assessments, and responding to incidents in a collaborative environment. Experience with observability platforms, containerization, and programming is essential.
Top Benefits:
401-K
401-K Matching
Child Care Benefits
+47 More