Top Site Reliability Engineer Jobs
The Senior Site Reliability Engineer will lead projects to enhance platform performance, reliability, and security using SRE methodologies. Responsibilities include troubleshooting complex service issues, automating monitoring, and optimizing systems. The role also involves leading deployments, managing observability tools, and guiding more junior engineers. The ideal candidate will keep documentation up to date and facilitate post-incident reviews to improve overall system reliability.
The Senior SRE will support global product deployment, provide engineering support, and manage infrastructure monitoring. Responsibilities include maintaining CI/CD tooling, enhancing operational documentation, collaborating with engineering teams, and ensuring security and compliance in production environments.
The Sr. Site Reliability Engineer at CDK Global will manage solutions and cloud infrastructures, ensuring reliability, scalability, and performance of enterprise-grade systems. Responsibilities include improving solution lifecycles, troubleshooting distributed system issues, and collaborating with cross-functional teams to maintain reliability standards.
The Senior Site Reliability Engineer at Greenlight will enhance system reliability and performance, manage incidents, and define service levels. Responsibilities include monitoring metrics, collaborating with product and platform teams, and improving internal tools for application delivery in a cloud-native environment.
The Senior Site Reliability Engineer will ensure the health and performance of applications through automation, monitoring, and collaboration with developers. Responsibilities include addressing production issues, optimizing processes for reliability, and adhering to security practices. On-call support and capacity planning are also key aspects of the role.
In this role, you will ensure the reliability and efficiency of Greenlight's cloud-native systems by monitoring services, enhancing platforms, and responding to production incidents. You will facilitate communication between teams and bring improvements to the incident management culture, ensuring service level objectives are met.
The Senior Site Reliability Engineer at Cognite will work cross-functionally to enhance service reliability across the tech stack. Responsibilities include optimizing cloud performance using GCP, AWS, and Azure, developing monitoring tools, collaborating with product teams, and educating others on reliability best practices.
The Senior Site Reliability Engineer at Zoox ensures high uptime for services critical to autonomous vehicle development. Responsibilities include designing fault-tolerant systems, deploying and operating services, supporting infrastructure, and analyzing performance metrics, leveraging tools like Kubernetes and AWS.
As a Senior Site Reliability Engineer at Coinbase, you will build automation, enhance systems, improve observability and reliability, and collaborate with infrastructure teams to optimize cloud performance. You will also level up engineering practices by sharing knowledge and advocating for reliability across teams.
As a Senior Site Reliability Engineer, you will build and maintain the data infrastructure for the Wikimedia Foundation's fundraising program, implementing data quality monitoring, managing distributed data systems, and collaborating with various teams to enhance data usage and compliance with regulations.
As a Senior Site Reliability Engineer, you will design and optimize applications and CI/CD pipelines, support systems through incident management, and interface with development teams to enhance performance and scalability. You'll be involved in troubleshooting and providing strategic guidance on best practices for application infrastructure and pipeline management.
The Senior Site Reliability Engineer will support and monitor MachineFi's infrastructure, deploy and maintain services, improve internal reliability, provide release pipelines for engineers, debug production issues, and ensure efficient operations across service levels.
The Senior Site Reliability Engineer will build and maintain the data infrastructure for the Wikimedia Foundation's fundraising program, implementing data quality monitoring and collaborating with teams for data integration. Responsibilities include managing distributed data systems, ensuring compliance with regulations, and enhancing data processes.
As a Senior SRE/DevOps Engineer, you will own the application stack and AWS infrastructure, debug runtime issues, develop internal tooling for managed Metabase installations, and improve automated deployments and testing.
The Senior Site Reliability Engineer will design and build global cloud service infrastructure, troubleshoot automation and monitoring across multiple cloud providers, optimize performance from application to firmware, participate in on-call rotations, and improve capabilities focusing on cost and maintainability.
The Senior Site Reliability Engineer will design and build global cloud infrastructure for MongoDB services, focusing on automation, monitoring, performance optimization, and resilience while participating in on-call rotations. They will work with a diverse Cloud Team to ensure service reliability and address challenges related to latency, data sovereignty, and cost efficiency.
The Director of Cloud/Infrastructure Operations will oversee the strategic direction, planning, and execution of cloud and infrastructure operations to ensure high availability, scalability, and performance of IT systems. Responsibilities include gathering and analyzing metrics, improving services, consulting on system design, driving automation, and balancing feature development with reliability. A bachelor's degree in computer science is required; experience with cloud services and distributed storage technologies is preferred. The role offers a competitive salary, benefits, and equity.
As a Senior Site Reliability Engineer, you will build and expand Box's service mesh infrastructure, ensuring reliability, scalability, security, and performance across services. Collaborating with different teams, you'll solve technical challenges in a multi-cloud environment using technologies like Envoy, Istio, and Kubernetes.
As a Senior Site Reliability Engineer, you will lead initiatives for cloud infrastructure, deploy software solutions, partner with developers to promote best practices, and participate in monitoring and on-call rotations to ensure effective service management. Your role will guide the evolution of Zipline's technical capabilities.
As a Senior Site Reliability Engineer, you will lead DevOps efforts, operating and automating systems critical to mission control and software management. Responsibilities include maintaining Kubernetes clusters, implementing monitoring systems, and managing deployment practices. You will work across teams to ensure the reliability and scalability of software systems.
As a Senior Site Reliability Engineer, you'll design and build scalable, reliable systems, improve the development lifecycle, and ensure system reliability. You'll collaborate with product engineers, lead incident response efforts, and utilize advanced tools for monitoring and performance optimization.
The Senior Site Reliability Engineer will design, implement, and maintain cloud infrastructure using IaC practices, work with development teams for end-to-end solutions, ensure system reliability and uptime, and implement monitoring and logging solutions while collaborating with various teams.
As a Senior Site Reliability Engineer at Paxos, you will ensure the reliability and scalability of cloud infrastructure, manage databases like RDS and Aurora, optimize AWS technologies, develop automation scripts, and collaborate with development teams for seamless feature deployment. You'll also implement security practices and conduct root cause analyses while providing on-call support.
The Senior Site Reliability Engineer will design and build infrastructure for a global cloud service, optimizing for performance and resilience. Responsibilities include automating services, monitoring system health, and participating in an on-call rotation to ensure mission-critical systems operate effectively and efficiently.
As a Senior Site Reliability Engineer at Macrometa, you will maintain, secure, and scale Kubernetes-based infrastructure while collaborating with development teams. Your role involves managing microservices in Docker containers, developing Kubernetes operators, and engaging in incident management.
Popular Job Searches
All Software Engineer Jobs
.NET Developer Jobs
Aerospace Thermal Engineering Jobs
AI Engineer Jobs
Android Developer Jobs
Automation Engineer Jobs
Backend Developer Jobs
Blockchain Developer Jobs
C# Jobs
C++ Jobs
Cloud Architect Jobs
Cloud Engineer Jobs
Design Engineer Jobs
DevOps Engineer Jobs
Director Of Engineering Jobs
Electrical Engineering Jobs
Embedded Software Engineer Jobs
Engineering Jobs
Engineering Manager Jobs
Environmental Engineering Jobs
Field Engineer Jobs
Front End Developer Jobs
Full Stack Developer Jobs
Game Developer Jobs
Golang Jobs
Hardware Engineer Jobs
Industrial Engineering Jobs
iOS Developer Jobs
Java Developer Jobs
Javascript Developer Jobs
Linux Jobs
Manufacturing Engineer Jobs
Mechanical Engineering Jobs
Network Engineer Jobs
PHP Developer Jobs
Process Engineer Jobs
Project Engineer Jobs
Prompt Engineering Jobs
Python Jobs
QA Jobs
Robotics Engineer Jobs
Ruby on Rails Jobs
Salesforce Administrator Jobs
Salesforce Developer Jobs
Scala Jobs
Sharepoint Developer Jobs
Site Reliability Engineer Jobs
Software Engineering Manager Jobs
Solutions Architect Jobs
SQL Developer Jobs
Structural Engineer Jobs
System Engineer Jobs
Test Engineer Jobs
Web Developer Jobs