Senior Site Reliability Engineer

Reposted 9 Days Ago
Hiring Remotely in United States
Remote
Senior level
Generative AI
The Role
The Senior Site Reliability Engineer will enhance cloud infrastructure, enforce SRE best practices, manage scalable systems, and mentor junior team members.

< Remote - United States >

Job Description:
Stability AI’s Engineering Operations team is looking for a Senior Site Reliability Engineer (SRE) to join our growing team and play a pivotal role in shaping and improving our cloud infrastructure. This person will work closely with engineering, IT, security, and product teams to drive innovation and reliability in an evolving environment. Candidates should have the initiative to build and improve a maturing cloud landscape.

Responsibilities:
  • Developing and enforcing SRE best practices and standards across the organization.
  • Architecting and managing scalable systems in AWS and other cloud environments, focusing on high availability and resilience.
  • Implementing and maintaining infrastructure as code using Terraform.
  • Setting up and refining monitoring, logging, and alerting systems.
  • Driving incident management and root cause analysis to improve system reliability.
  • Championing SRE principles and mentoring junior team members.
  • Collaborating with development teams to enhance CI/CD pipelines.
Qualifications:
  • Experience scaling resource-intensive systems, whether storage, networking, or compute.
  • Knowledge of and experience with Kubernetes or other container scaling solutions.
  • Background in software development or automation scripting.
  • Knowledge of and experience with Grafana, the ELK stack, or similar tools.
  • Cloud security experience.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.


Top Skills

AWS
ELK Stack
Grafana
Kubernetes
Terraform

The Company
HQ: London
149 Employees

What We Do

Stability AI is building open AI tools that will let us reach our potential.

Designing and implementing solutions using collective intelligence and augmented technology.
