Top Site Reliability Engineer Jobs
As a Principal Site Reliability Engineer in the Datastores team, you will ensure the reliability and scalability of mission-critical datastores, drive automation for operational excellence, and collaborate with cross-functional teams to shape architecture and strategy. This role involves mentoring team members and designing systems optimized for availability and performance.
As a Systems Reliability Engineer, you'll build and maintain the Edge platform across a global network, focusing on automation, scalability, and operational excellence. Your responsibilities include managing service availability, developing tools for performance improvement, and leveraging monitoring tools while enhancing platform capabilities. You'll utilize your coding skills in Go or Python, alongside your knowledge of Linux and networking protocols.
As a Database SRE, you will design, build, and maintain the database infrastructure for a globally distributed trading system. Responsibilities include configuring PostgreSQL and MySQL servers, improving query performance, implementing monitoring solutions, and applying infrastructure-as-code tools. You will collaborate with a focused team to enhance database infrastructure supporting real-time workflows.
As a Staff Site Reliability Engineer, you will ensure the reliability of data infrastructure, collaborate on automation, monitoring, and compliance, and lead efforts to maintain data performance and availability.
As a Site Reliability Engineer at Alloy, you'll architect and build infrastructure solutions to improve uptime, provision AWS resources using Terraform, and automate deployments. You will also focus on system resilience, implement monitoring tools, and support application developers in maintaining a smooth deployment pipeline.
The Database SRE will design, build, and maintain HRT's SQL and NoSQL database infrastructure, focusing on performance and scalability. Responsibilities include configuring PostgreSQL and MySQL servers, working with infrastructure-as-code tools, and optimizing database performance with various monitoring tools.
The role involves supporting and diagnosing issues in a distributed environment, capacity planning, and application migrations. Applicants must be knowledgeable in UNIX/Linux and have experience in Python and SQL. Strong communication skills and the ability to work under pressure are essential.
As a Site Reliability Engineer at Peloton, you will create and manage tooling and services for infrastructure, automate processes, and ensure system reliability. Your role will include advocating for best practices, taking part in incident retrospectives, and collaborating with various teams to enhance the developer experience.
Featured Jobs
The Site Reliability Engineer at Citadel will focus on ensuring the reliability and performance of applications, automating tasks, resolving systemic issues, and collaborating with various engineering teams. Responsibilities also include incident management, improving systems operationally, and promoting the SRE mindset across teams.
As a Senior DevOps/SRE Engineer, you will design, implement, and maintain scalable infrastructure on Google Cloud Platform. Responsibilities include managing CI/CD pipelines, automating tasks with IaC tools like Terraform, and optimizing system performance. You will collaborate with development teams to ensure reliability and security of cloud resources, while also mentoring junior engineers.
As a Sr Lead Site Reliability Engineer at Capital One Shopping, you will lead diverse tech projects, collaborate with product managers, and deliver cloud-based solutions. You will mentor developers, stay updated on tech trends, and may also be involved in coding and code evaluation. The role focuses on ensuring technical reliability and performance for financial empowerment technologies.
As a Site Reliability Engineer focused on Environment Automation, you will automate workflows for provisioning and maintaining a large number of GitLab environments. Responsibilities include monitoring system performance, responding to user emergencies, and enhancing security measures while collaborating with engineering teams on architecture.
As a Lead Site Reliability Engineer, you will architect, build, and support Linux infrastructure, design scalable orchestration platforms, influence architectural decisions with a focus on security and performance, mentor team engineers, and contribute to engineering standards. You'll work on automating systems and handle complex technical issues while supporting public cloud technologies.
The Lead Site Reliability Engineer will design, develop, and operate a large-scale cloud platform, ensuring services are reliable and performant. Responsibilities include collaborating with engineering teams, managing cloud costs, automating compliance reporting, and establishing security guardrails for cloud adoption.
As a Lead Site Reliability Engineer, you will focus on enhancing system operations by building foundational services, improving scalability, and collaborating with product and engineering teams to boost service reliability and performance. Your responsibilities include shipping services, eliminating bottlenecks, implementing architectural improvements, and championing best practices across teams.
As a Lead Site Reliability Engineer at Klaviyo, you will design and develop systems to enhance the availability, scalability, and performance of Klaviyo's services. You will collaborate across teams, implement architectural improvements, and take part in on-call duties to maintain system reliability. Your goal is to empower product teams to deliver high-quality software efficiently.
The Platform SRE II role involves architecting and developing software frameworks for Grubhub's cloud platform, ensuring they are testable and fault tolerant. You'll work with engineers to enable API calls, data storage, and job queues, while providing tier-3 support and consulting on infrastructure design.
As a Software Engineer in Test, you'll design and maintain scalable test automation pipelines, perform testing on cloud-based infrastructure, participate in code reviews, create automated test scripts, and collaborate with QA and SRE teams. You'll also work on performance testing and automation of reports.
The Sr. Engineering Manager will lead the Cloud Infrastructure and Site Reliability Engineering (SRE) teams, focusing on optimizing cloud resources, managing incident responses, and maintaining operational practices. Responsibilities include mentoring staff, overseeing technical decisions, ensuring seamless deployments, and conducting post-incident reviews.
As a Principal Site Reliability Engineer, you will design and optimize cloud infrastructure, enhance monitoring, and implement CI/CD pipelines. You'll collaborate with engineering and operations teams to ensure security, reliability, and scalability while mentoring junior engineers in best practices.
The Site Reliability Engineer will install and maintain Anduril’s software solutions for mission-critical capabilities, support deployment operations, and ensure system reliability. Responsibilities include configuration, reliability engineering, software tooling improvements, and customer engagement.
As a Site Reliability Engineer in Cloud Operations, you will design and support cloud-based IT solutions, focusing on infrastructure, security, and scalability. Your responsibilities include leading automation efforts, integrating legacy systems, and developing CI/CD frameworks. You'll guide teams on best practices, assist in monitoring implementation, and provide technical leadership across departments.
The Site Reliability Engineer will manage and support the operation of satellite ground systems, focusing on automation and infrastructure management. Responsibilities include developing automation for system operations, ensuring service reliability, coordinating with software development teams, and enhancing operational solutions through automation.
The Site Reliability Engineer will focus on DevSecOps deployment automation and operate multi-tenant environments across various infrastructures including AWS GovCloud and VMware. Responsibilities include building and securing environments, collaborating with Cybersecurity and Development teams, and maintaining full CI/CD toolchains, particularly in classified settings.
As a Lead Site Reliability Engineer at JPMorgan Chase, you will lead site reliability initiatives, improve application and platform stability, mentor other engineers, and address technical issues during incidents. You are responsible for analyzing performance metrics and implementing best practices in site reliability engineering.
Popular Job Searches
All Software Engineer Jobs
.NET Developer Jobs
Aerospace Thermal Engineering Jobs
AI Engineer Jobs
Android Developer Jobs
Automation Engineer Jobs
Backend Developer Jobs
Blockchain Developer Jobs
C# Jobs
C++ Jobs
Cloud Architect Jobs
Cloud Engineer Jobs
Design Engineer Jobs
DevOps Engineer Jobs
Director Of Engineering Jobs
Electrical Engineering Jobs
Embedded Software Engineer Jobs
Engineering Jobs
Engineering Manager Jobs
Environmental Engineering Jobs
Field Engineer Jobs
Front End Developer Jobs
Full Stack Developer Jobs
Game Developer Jobs
Golang Jobs
Hardware Engineer Jobs
Industrial Engineering Jobs
iOS Developer Jobs
Java Developer Jobs
Javascript Developer Jobs
Linux Jobs
Manufacturing Engineer Jobs
Mechanical Engineering Jobs
Network Engineer Jobs
PHP Developer Jobs
Process Engineer Jobs
Project Engineer Jobs
Prompt Engineering Jobs
Python Jobs
QA Jobs
Robotics Engineer Jobs
Ruby on Rails Jobs
Salesforce Administrator Jobs
Salesforce Developer Jobs
Scala Jobs
Sharepoint Developer Jobs
Site Reliability Engineer Jobs
Software Engineering Manager Jobs
Solutions Architect Jobs
SQL Developer Jobs
Structural Engineer Jobs
System Engineer Jobs
Test Engineer Jobs
Web Developer Jobs