Senior Manager, Site Reliability Engineering

San Francisco, CA, USA
In-Office
227K-325K Annually
Senior level
News + Entertainment
The Role
Lead and grow the Site Reliability Engineering team, ensuring platform reliability and performance through strategic planning and incident management. Champion AI integration into SRE practices to improve automation and operational efficiency while fostering a blameless learning culture.
Summary Generated by Built In

About the Role:

Site Reliability Engineering (SRE) at Tubi is not a traditional operations team. We are a software engineering organization that applies a developer's mindset and toolkit to the challenges of building and running large-scale, distributed systems. Our mission is to engineer resilience from the ground up, enabling our product teams to innovate rapidly while ensuring our users have a stellar experience. We own the availability, latency, performance, and capacity of our platform, and we achieve our goals through a culture of data-driven decision-making, blameless learning, and relentless automation.

We are seeking an experienced and visionary Senior SRE Manager to lead and grow our newly built Site Reliability Engineering team. You are more than a people manager or a tech lead; you are the strategic leader responsible for architecting our reliability roadmap. You will build and mentor a team of talented engineers, foster a culture of blameless learning and continuous improvement, and champion the engineering practices that allow us to balance rapid innovation with rock-solid stability. You will be a key influencer in our engineering leadership, partnering with peers across the organization to ensure reliability is a shared responsibility and a core tenet of our engineering culture.

What You'll Do:

  • Team Leadership & Mentorship:
    • Lead, mentor, and grow a team of Site Reliability Engineers. Foster a culture of innovation and technical excellence where engineers feel empowered to do their best work. Provide personalized coaching, create professional development plans, and guide the careers of senior and emerging talent within the team.
    • Establish equitable, sustainable on-call practices (including global coverage where applicable) that protect focus time and avoid burnout.
    • Define team rituals - runbook reviews, game days, and incident retros - that reinforce quality and learning.
  • Strategic Planning & Vision: Define and drive the multi-year technical strategy and vision for Tubi’s observability and automation platforms. Partner with the infrastructure lead to align Tubi’s infrastructure and SRE roadmaps. Partner with tech leaders to align the SRE roadmap with business objectives. Champion a data-driven approach to reliability, using Service Level Objectives (SLOs) and error budgets to facilitate productive conversations about risk and feature velocity.
  • Operational Excellence & Incident Management: 
    • Own the end-to-end availability, performance, and efficiency of our critical user-facing services. Evolve our incident response practice to reduce Mean Time to Resolution (MTTR) and increase Mean Time Between Failures (MTBF). Champion a rigorous, blameless, and data-driven post-mortem culture to ensure we learn from both successes and failures, driving engineering teams toward systemic fixes and automation that prevent incidents from recurring.
    • Streamline our existing processes and practices, and collaborate with other teams to raise our production release standards.
    • Define and tune a 24×7 on-call rotation for low noise and fast response; act as executive escalation partner during major incidents.
    • Own disaster-recovery strategy (playbooks, failover drills, recovery simulations) and track SLO gaps with time-bound remediations.
  • Financial & Vendor Management: Own the SRE budget, tooling, and headcount. Manage relationships with key third-party vendors for our observability and SRE-related AI platforms. Work with the infrastructure lead and the finance team on contract negotiations, and ensure we derive maximum value from our investments.
  • Cross-Functional Collaboration: Act as a key influencer and strategic partner to leaders in Software Engineering, Product Management, and Infra/Sec. Drive the adoption of SRE best practices and principles throughout the organization, ensuring new services are designed for reliability, scalability, and observability from day one.
  • The AI Mandate: Building the Future of Observability with AI. You will not just manage a team that uses AI; you will lead the charge in building an AI-native SRE function. This is a strategic mandate that requires a forward-thinking leader who understands both the potential and the pitfalls of integrating intelligent systems into critical operations. This includes:
    • AIOps Strategy Development: Developing and executing the strategy for integrating AIOps and machine learning into our observability stack. Your goal will be to move the team from a reactive monitoring posture to one of predictive maintenance and automated anomaly detection, fundamentally changing how we ensure reliability.
    • Accelerating Automation with AI: Championing the effective and responsible use of AI-assisted coding tools (e.g., Claude Code, Cursor) within the SRE team. You will set the standards and practices to leverage these tools to accelerate the development of automation, operational tooling, and infrastructure code.
    • Building the Business Case: Building the techno-economic case for new AI tooling, managing vendor relationships, and ensuring the cost-effective and secure implementation of these powerful systems. You must be able to articulate the ROI of these investments in terms of reduced downtime, improved operational efficiency, and faster incident resolution.
    • Fostering Critical AI Literacy: Fostering a culture that can critically evaluate, debug, and learn from the outputs of AI systems. This involves extending our blameless post-mortem philosophy to AI-driven actions and recommendations, ensuring that the team remains in control and understands the "why" behind automated decisions.

Your Background:

  • 8+ years of experience in a technical field, with at least 3 years in an engineering leadership position managing SRE, DevOps, or Production Engineering teams.
  • A deep, principled understanding of SRE tenets, including Service Level Indicators (SLIs), SLOs, error budgets, toil reduction, and capacity planning.
  • Exceptional communication, negotiation, and influencing skills, with the ability to articulate complex technical concepts and strategies to both technical and non-technical stakeholders at all levels of the organization.
  • A strong technical background as a hands-on software engineer or site reliability engineer prior to moving into management. Deep knowledge of AWS services (especially networking, IAM, EKS, ALBs/NLBs, Route 53, CloudWatch). Proven experience with Kubernetes in production (EKS preferred), including service exposure, networking, and availability engineering.
  • Hands-on familiarity with modern SRE tools and technologies, including Infrastructure as Code (e.g., Terraform, Ansible), container orchestration (Kubernetes), observability platforms (e.g., Prometheus, Grafana, Datadog, Splunk), incident tooling (e.g., PagerDuty, FireHydrant), deployment-safety tooling (e.g., Argo Rollouts, LaunchDarkly), and observability standards (e.g., OpenTelemetry).

Pursuant to state and local pay disclosure requirements, the annual pay range for this role is listed below. The final offer amount depends on education, skills, experience, and location. This role is also eligible for various benefits, including medical/dental/vision, insurance, a 401(k) plan, paid time off, and other benefits in accordance with applicable plan documents.

High-cost labor markets such as, but not limited to, Los Angeles, New York City, and San Francisco:
$227,200-$324,500 USD

Tubi is a division of Fox Corporation, and the FOX Employee Benefits summary covers the majority of US employee benefits. The distinctions below outline the differences between the Tubi and FOX benefits:

  • For US-based non-exempt Tubi employees, the FOX Employee Benefits summary accurately captures the Vacation and Sick Time.
  • For all salaried/exempt employees, in lieu of the FOX Vacation policy, Tubi offers a Flexible Time off Policy to manage all personal matters.
  • For all full-time, regular employees, in lieu of FOX Paid Parental Leave, Tubi offers a generous Parental Leave Program, which allows parents twelve (12) weeks of paid bonding leave within the first year of birth, adoption, surrogacy, or foster placement of a child in addition to applicable government leave program(s) and FOX’s short-term disability policy. This time is 100% paid through a combination of any applicable state, city, and federal leaves and wage-replacement programs in addition to contributions made by Tubi.
  • For all full-time, regular employees, Tubi offers a monthly wellness reimbursement.

About Tubi:

Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals made for the most passionate fans. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law.

Skills Required

  • 8+ years of experience in a technical field
  • At least 3 years in an engineering leadership position managing SRE, DevOps, or Production Engineering teams
  • Deep knowledge of AWS services, especially networking, IAM, EKS, ALBs/NLBs, Route 53, CloudWatch
  • Proven experience with Kubernetes in production
  • Hands-on familiarity with SRE tools and technologies
  • Exceptional communication, negotiation, and influencing skills

Tubi Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Tubi and has not been reviewed or approved by Tubi.

  • Healthcare Strength: Medical, dental, and vision start from day one with multiple plan options under the parent-company framework; a high employer premium share and added supports like HSA funding and mental-health resources strengthen coverage.
  • Retirement Support: A 401(k) with matching plus an additional company contribution and a student-debt program that counts eligible payments toward matching indicate robust long-term savings support.
  • Parental & Family Support: A dedicated 12 weeks of paid parental bonding leave, along with backup care and adoption/surrogacy reimbursements, provides meaningful family assistance.

The Company
HQ: San Francisco, CA
504 Employees
Year Founded: 2014

What We Do

Tubi, a division of FOX Entertainment, is an ad-supported video-on-demand service with more than 40,000 movies and TV shows in the U.S., including a growing library of Tubi Originals, 145+ local and live news, sports and entertainment channels, and 400+ entertainment partners, featuring content from every major Hollywood studio. Tubi gives fans of film, television, news and sports an easy way to discover new content that is completely free. Tubi is available in the U.S. on Android and iOS mobile devices, Amazon Echo Show, Google Nest Hub Max, Comcast Xfinity X1, Cox Contour, and on connected television devices such as Amazon Fire TV, Vizio TVs, LG TVs, Sony TVs, Samsung TVs, Roku, Apple TV, Chromecast, Android TV, PlayStation 4 & 5, Xbox One & Series X|S, and soon on Hisense TVs globally. Consumers can also watch Tubi content on the web at www.tubi.tv.
