Lead Site Reliability Engineer

Posted 4 Days Ago
Hiring Remotely in US
Remote
3-5 Years Experience
Information Technology • Software
We empower all teams to deliver and control their software.
The Role
Lead the development and refinement of SRE tools and processes, enable engineers to deliver services with autonomy and reliability, design disaster recovery plans, mentor team members, drive technology adoption, and identify performance bottlenecks.

About the Job: 

Software powers the world, and LaunchDarkly empowers all teams to deliver and control the best software. We serve trillions of feature flags daily to help teams ship better software faster and eliminate risk for companies big and small.

We're based in downtown Oakland and growing quickly. You'll help us tackle some of the most challenging engineering problems around, like delivering feature flags to hundreds of millions of users worldwide in milliseconds.

In this role, you'll oversee the health of our core systems and reliability tooling, respond to and mitigate incidents quickly, and identify and drive opportunities that make our core services more resilient. You will also identify and develop force-multiplying capabilities for our internal engineering teams, helping our engineers become more effective at shipping robust code and thinking about reliable design earlier in the life cycle.

Our core daily technologies include AWS, Golang, CockroachDB, ElasticSearch, Redis, Flink, Kinesis, and Terraform.

Responsibilities:

  • Lead the development and continuous refinement of SRE tools and processes to improve software delivery, observability, reliability and operational efficiency. Your impact extends beyond your team’s boundary to proactively improve our overall service health.
  • Level up our engineering team so that they deliver their services with greater autonomy, reliability, and performance, through offerings written in Go and Terraform or delivered through existing tools.

  • Define and standardize service health and reliability metrics that align with business goals, and ensure these metrics are transparent and actionable.

  • Help improve the effectiveness of our incident management lifecycle and drive initiatives to train key roles involved in incident response and our post-incident review process.

  • Partner with various team members to define and mature our SRE culture through principles, technical frameworks, tooling, and processes. You will mentor and coach SRE team members and engineers in adjacent teams to promote a culture of SRE learning and growth.

  • Drive the adoption of new technologies, system designs and best practices in code health, testing, observability, and service maintainability across teams.

  • Proactively identify and resolve potential performance and scalability bottlenecks in our front-end and back-end systems and underlying infrastructure.

  • Analyze the performance of SQL queries, suggest improvements and build guardrails for teams.

Qualifications:

  • Demonstrable experience building and operating large-scale, highly available distributed systems. You also possess advanced analytical skills to anticipate and mitigate complex system behaviors and incidents before they impact our customers.

  • Comfort with server-side web development (e.g., in Java / Scala, Ruby, Python, Golang, Node.js) and Infrastructure-as-Code (e.g., Terraform).

  • Experience guiding the architectural direction and scalability considerations for new projects.

  • Strong understanding and proactive management of security practices related to SRE, coordinating with our Security team to fortify infrastructure.

  • Extensive experience working with major cloud providers, observability tooling, and RDBMS technologies is crucial for this role.

  • Experience leading team ceremonies: project ideation, planning, grooming, and project retrospectives. You will also drive alignment on decisions with cross-team impact, identify areas of misalignment across the team, and bring stakeholders together to realign.

  • Strong customer focus and ability to make technical decisions that tie back to business goals.

  • Exceptional communication skills, a positive attitude, and a high degree of empathy.

Pay:

Target pay ranges based on Geographic Zones* for Levels P4-P5:

  • Zone 1: San Francisco/Bay Area or New York City Metropolitan Area: $183,600 - $235,000**

  • Zone 2: Boston, DC, Irvine, LA, Monterey, Santa Barbara, Santa Rosa, Seattle: $165,600 - $212,000**

  • Zone 3: All other US locations: $156,510 - $200,000**

*Restricted Stock Units (RSUs), health, vision, and dental insurance, and mental health benefits in addition to salary.

LaunchDarkly operates from a place of high trust and transparency; we are happy to state the pay range for our open roles to best align with your needs. Exact compensation may vary based on skills, experience, degree level, and location.

About LaunchDarkly:

Modern software delivery was supposed to be the foundation for a thriving digital business, but reality has proven otherwise. Slow, inefficient development cycles, costly outages, and fragmented customer experiences are preventing developers from building their best software. The LaunchDarkly platform helps developers innovate on new features faster while protecting them with a safety valve to instantly rewind when things go wrong. Developers can target product experiences to any customer segment and maximize the business impact of every feature. And by gradually rolling out new application components, they escape nightmare "big-bang" technology migrations.

The LaunchDarkly platform was built to guide engineers to the next frontier of DevOps by:

  • Improving the velocity and stability of software releases, without the fear of end customer outages

  • Delivering targeted experiences by easily personalizing features to customer cohorts

  • Maximizing the business impact of every feature through the ability to experiment and optimize

  • Coordinating the release and optimization of software to provide consistent experiences across mobile platforms and device types

  • Improving the effectiveness and productivity of engineering teams, by providing insights into engineering cadence and stability

At LaunchDarkly, we believe in the power of teams. We're building a team that is humble, open, collaborative, respectful and kind. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, or disability status. LD invites any applicant to review our written Affirmative Action Plan. To do so, contact People Ops at [email protected].

Do you need a disability accommodation?

Fill out this accommodations request form and someone from our People Operations team will contact you for assistance. 

Top Skills

Go
Terraform
The Company
HQ: Oakland, CA
500 Employees
Remote Workplace
Year Founded: 2014

What We Do

LaunchDarkly isn’t just a leader in feature management — it’s the first scalable feature management platform. Feature management allows development teams to innovate faster by fundamentally transforming how software is delivered to customers. With the ability to gradually release new software features to any segment of users on any platform, DevOps teams can standardize safe releases at scale, accelerate their journey to the cloud and collaborate more effectively with business teams.

Today, LaunchDarkly serves up to 20 trillion feature flags a day at peak, and that number continues to grow. Founded in 2014 in Oakland, California by Edith Harbaugh and John Kodumal, LaunchDarkly has been named to the Forbes Cloud 100 list, InfoWorld’s 2021 Technology of the Year list, and the Enterprise Tech 30 list.

Why Work With Us

We're Oakland-based but remote-first, and we have one of the few women CEOs in our industry.

Top reasons to work at LaunchDarkly:

  • Great work/life balance and unlimited PTO

  • Awesome culture and human-centric values

  • Product is a "need to have": category leader in feature management

  • Competitive pay and healthcare benefits

  • Pre-IPO stock
