Top Site Reliability Engineer Jobs
Healthtech • Other • Software
The role involves managing PostgreSQL services, ensuring high availability and performance, driving incident response, automating tasks, and improving observability for a 24x7 SaaS platform.
Top Skills:
Ansible, Bash, Datadog, Grafana, Haproxy, New Relic, Pgbackrest, Pgbouncer, Postgres, Powershell, Prometheus, Python, Repmgr, Terraform
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer, you'll maintain and improve the data export system, focusing on observability, reliability, and scalability while guiding junior engineers and adhering to best practices.
Top Skills:
Buildkite, Docker Swarm, Git, Gitlab, Java, Jenkins, Kafka, Kotlin, Kubernetes, MongoDB, Postgres, Ruby, Sidekiq, Sns, Sqs
Marketing Tech • Mobile • Software
The Senior Site Reliability Engineer will maintain the Currents data export system, solve reliability issues, mentor junior engineers, and improve system performance and scalability.
Top Skills:
Buildkite, Datadog, Docker, Git, Gitlab, Java, Jenkins, Kafka, Kotlin, Kubernetes, MongoDB, Pagerduty, Postgres, Ruby, Sentry, Sidekiq, Sns, Sqs
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer, you'll maintain and enhance the Currents data export system, focusing on observability, scalability, and reliability, while mentoring junior engineers and solving performance issues.
Top Skills:
Buildkite, Datadog, Docker Swarm, Git, Gitlab, Java, Jenkins, Kafka, Kotlin, Kubernetes, MongoDB, Pagerduty, Postgres, Ruby, Sentry, Sidekiq, Sns, Sqs
Blockchain • Fintech • Payments • Consulting • Cryptocurrency • Cybersecurity • Quantum Computing
The Senior BizOps Engineer is responsible for ensuring platform stability and resilience, guiding teams in product development, and facilitating operational excellence throughout the software lifecycle.
Top Skills:
Artifactory, Bitbucket, C, C++, Chef, Dynatrace, Git, Go, Java, Jenkins, Maven, Oracle, Perl, Pl/Sql, Postgres, Python, Ruby, Splunk, SQL
Cloud • Software
Responsible for maintaining FedRAMP compliant services, designing infrastructure, monitoring systems, and ensuring security for federal regions, while driving automation and collaboration with development teams.
Top Skills:
AWS, Fedramp, Go, Kubernetes, Puppet, Python, Terraform, Unix/Linux
Information Technology
As a Site Reliability Engineer, you'll build resilient systems by implementing automation, managing infrastructure in cloud environments, and enhancing deployment processes.
Top Skills:
Ansible, AWS, Bash, Ci/Cd, CloudFormation, Docker, Git, Gitlab, Groovy, JSON, Kubernetes, Openshift, Powershell, Python, Rest, Ruby, Terraform, XML
Cloud • Fintech • HR Tech
This role involves managing AWS resources using IaC, building self-service platforms for developers, and maintaining CI/CD pipelines, along with ensuring system reliability and performance.
Top Skills:
Argo Cd, AWS, CloudFormation, Cloudwatch, Docker, Elk, Jenkins, Kubernetes, Prometheus, Teamcity, Terraform
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills:
AWS, Kubernetes, Terraform, Terragrunt
Gaming
The role involves ensuring production quality, owning system reliability, and participating in decision-making. Responsibilities include incident response and lifecycle management in cloud gaming technologies.
Top Skills:
Bash, C++, Elasticsearch, Go, Istio, Java, Kafka, Kong Api Gateway, Kubernetes, Kuma, Linkerd, MongoDB, MySQL, Postgres, Python, Redis, Rust
Artificial Intelligence • Software
As a Site Reliability Engineer at Mercor, you will ensure production reliability, develop the SRE function, and collaborate with engineering teams to maintain system performance.
Top Skills:
AWS, Kubernetes, Spacelift, Terraform
Big Data • Machine Learning • Software • Analytics
As a Lead Site Reliability Engineer, you will drive the reliability strategy, improve system health, lead incident management, and mentor engineers for a multi-region SaaS platform.
Top Skills:
Argocd, C++, Ci/Cd, Cloud Platforms, Datadog, Gitops, Grafana, Infrastructure As Code, Java, JavaScript, Kubernetes, Python
Computer Vision • Information Technology • Machine Learning • Natural Language Processing • Real Estate • Software
The SRE will maintain infrastructure for SaaS products on AWS, support developers, manage platform components, and handle IT tasks.
Top Skills:
AWS, Computer Vision, Iac, Large Language Models, Nlp, Terraform
Fintech • Analytics
The Site Reliability Engineer will support and automate critical real-time applications, ensuring service availability and quality across cloud and on-premise deployments, while collaborating with various teams on operational documentation and incident management.
Top Skills:
AWS, Azure, Datadog, Docker, Git, Kubernetes, Python, Unix/Linux
Artificial Intelligence • Other • Sales • Software
The role involves designing and advancing infrastructure for the engineering team, ensuring the reliability of Kubernetes clusters, automating operations, and building machine learning infrastructure.
Top Skills:
Argo, AWS, Azure, CloudFormation, Flux, Github Actions, Go, GCP, Kubernetes, Postgres, Python, Terraform
Artificial Intelligence • Information Technology • Cybersecurity • Defense
As a Site Reliability Engineer, you'll ensure system reliability in a government environment, manage incidents, and collaborate with engineering teams on operational tasks and improvements while maintaining security compliance.
Top Skills:
AWS, Bash, Docker, Docker Compose, Grafana, Linux/Unix, Loki, Mimir, Prometheus, Python, Terraform
Cloud • Information Technology
The Site Reliability Engineer I is responsible for supporting Backblaze’s infrastructure stability by addressing customer issues, monitoring system health, and improving operational processes through documentation and automation.
Top Skills:
Ansible, Linux, Zabbix
Fintech • Financial Services
The role involves developing and delivering software solutions, collaborating cross-functionally, ensuring secure coding practices, managing multi-faceted projects, and mentoring team members.
Top Skills:
Frameworks, Programming Languages, Tools
Other • Energy
The Site Reliability Engineer will build and maintain reliable systems on Google Cloud Platform, automate operations, and improve system performance and reliability.
Top Skills:
Airflow, BigQuery, Cloud Monitoring, Dataflow, Datastream, Docker, Github Actions, Gitlab Ci, Go, Google Cloud Platform, Grafana, Iam, Java, Kubernetes, Prometheus, Python, Terraform
Artificial Intelligence • Healthtech • Information Technology • Software
As the first Site Reliability Engineer in the US, you'll ensure platform stability and oversee incident responses during PST hours, bridging infrastructure and code, while improving operability and compliance in a medical-device environment.
Top Skills:
AWS, Elixir, Kubernetes, Terraform
AdTech • Big Data • Marketing Tech • Software
Responsible for owning and optimizing the Internal Developer Platform, improving reliability, scalability, and usability while supporting engineering teams and standardizing operational processes through automation and best practices.
Top Skills:
Arm, AWS, Azure, Bash, CloudFormation, Consul, Docker, Github Actions, Hashicorp, Jenkins, Kubernetes, Linux, Nomad, Powershell, Python, Splunk, Sumo Logic, Terraform, Vault, Windows
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
Lead the design and operation of large scale Kubernetes clusters, ensuring high availability and performance while supporting system lifecycle and reliability improvements.
Top Skills:
Containers, Go, Kubernetes, Linux, Networking, Openstack, Perl, Python, Ruby
Artificial Intelligence • Information Technology • Software
The role involves defining and evolving technical foundations for AI evaluation, optimizing performance, designing resilient systems, and collaborating with various teams for infrastructure improvements.
Top Skills:
Node.js, Postgres, Serverless Environments, Typescript
Healthtech
The Senior Software Engineer will enhance system reliability, manage Kubernetes and AWS environments, oversee incident responses, and implement observability measures.
Top Skills:
AWS, Cloudwatch, Elb, Github Actions, Kubernetes, Observability Tooling, Terraform, Vpc
Fintech • Payments • Financial Services
Build, operate, and scale AWS-based infrastructure using IaC (Terraform); manage EKS and serverless environments; create CI/CD pipelines; implement observability (OpenTelemetry/Prometheus/New Relic); support Postgres/RDS (Aurora); lead incident response; and define SRE practices (SLIs/SLOs/error budgets).
Top Skills:
Aurora, AWS, Aws Rds, Azure, CloudFormation, Ecs, Eks, Github Actions, Gitlab, Go, GCP, Java, Kubernetes, New Relic, Opentelemetry, Opentofu, Postgres, Prometheus, Python, Ruby, Serverless, Terraform, Terragrunt