On May 23, 2023, a group of Warner Bros. Discovery’s senior tech leaders and various engineering teams gathered in a globally distributed war room to reveal one of the largest launches of their careers: the new Max platform.
And they were bored. But that was a very good sign.
A quiet night wasn’t a foregone conclusion, but it was the outcome for which SVP of Platform Engineering and Operations Girish Rao and the Max engineering team had spent the better part of a year preparing.
The scale of the work was monumental: over 5,000 engineers worldwide; three different cloud provider regions running over 400 microservices and over 200 databases across each region; more than 40,000 pods; build and deployment CI/CD pipelines supporting over 800 engineering source code repos; and 400 deployments at a given time.
“We were leveraging very complex technologies and tools to build a secure, scalable, reliable and efficient platform to delight our customers,” Girish said. “We could spend hours talking about microservice architectures, auto-scaling services, multiregion designs and operational readiness — but our work truly centered around investing in leading indicators of platform service and client health, with a focus on a rich customer experience.
“You want to catch the gas leak before the house catches fire,” he added.
But Girish’s team wasn’t just patching for leaks — they were building the ‘house’ (the Max platform) from the ground up.
“One of the first and most critical tenets of our build for the Max platform was starting from scratch,” VP of Site Reliability Engineering Tom Leaman told Built In. “We weren’t building on top of a legacy platform, but building a completely new product and platform from the ground up.”
He went on to explain, “To make that happen, we needed to act on one of our organizational guiding principles: Act as one team.”
WARNER BROS. DISCOVERY’S GUIDING PRINCIPLES
- Act as one team
- Create what’s next
- Empower storytelling
- Champion inclusion
- Dream it and own it
Act as One Team
In September 2022, Girish and his team set to work. The mandate was clear: build with velocity without losing quality. He had a singular question in mind en route to achieving that goal: “How do we bring all of these people together and start stitching the unified workflow?”
The first response to that question relied on Director of Release and Delivery Engineering Tim Johnson and his team.
“As a new team, we had to decide how we deliver software, so we had to establish a normalized way of working across thousands of engineers to understand how we would deliver quality and measure ourselves,” he said.
The first test of these processes was onboarding those thousands of engineers to the new team in a matter of weeks. First, the team had to have the tools and CI/CD pipelines in place to begin delivering work, as well as the processes for engineers to provide feedback on that work along the way.
“We created self-serve automation tools that allowed teams to onboard themselves,” Tim said. “We cascaded throughout the organization over a matter of weeks with the documentation and seminars that would answer the questions of how to get work done and set expectations.”
Those expectations were standard across the team, from front-end client teams to back-end service teams and everyone in between. The expectations weren’t just in place to simplify onboarding — they also allowed the Max team to have clear insights into the quality needed to validate and programmatically promote work from non-production to production environments.
“We avoided a constant moving needle that would have required people to ask why something was considered good before but not later,” Tim said. “This widespread clarity required control, discipline and communication across our teams to make sure the quality bar was always ratcheting up as we progressed toward our launch date.”
Create What’s Next
The Max team’s intentional approach to building and testing had more than 5,000 engineers working across four separate environments that allowed for different layers of functionality: development, integration, staging and production.
The first environments were made available six months before launch, and the two-week sprint cycles began. This set the release cadence for promoting builds into higher environments. By February 2023, builds began moving into the staging environment to harden functionality, scale and performance targets. In parallel, the push to production had begun for content and migration workflows.
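To make the idea of programmatic promotion concrete, here is a minimal sketch of a quality gate that only moves a build to the next environment when it clears that environment's bar. The environment order comes from the article; the `Build` type, the metric names and every threshold are hypothetical and are not a description of the Max team's actual tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Environment order as described in the article.
ENVIRONMENTS = ["development", "integration", "staging", "production"]

# Hypothetical quality bars required to ENTER each environment.
# The numbers are invented for illustration; the bar tightens with each step.
QUALITY_BARS = {
    "integration": {"min_pass_rate": 0.95, "max_critical_bugs": 5},
    "staging": {"min_pass_rate": 0.98, "max_critical_bugs": 1},
    "production": {"min_pass_rate": 0.995, "max_critical_bugs": 0},
}


@dataclass(frozen=True)
class Build:
    name: str
    environment: str
    pass_rate: float      # fraction of automated tests passing (assumed metric)
    critical_bugs: int    # open critical bugs (assumed metric)


def next_environment(current: str) -> Optional[str]:
    """Return the environment after `current`, or None if it is the last one."""
    index = ENVIRONMENTS.index(current)
    return ENVIRONMENTS[index + 1] if index + 1 < len(ENVIRONMENTS) else None


def can_promote(build: Build) -> bool:
    """Check the build against the quality bar of the next environment."""
    target = next_environment(build.environment)
    if target is None:
        return False  # already in production, nowhere to go
    bar = QUALITY_BARS[target]
    return (build.pass_rate >= bar["min_pass_rate"]
            and build.critical_bugs <= bar["max_critical_bugs"])


def promote(build: Build) -> Build:
    """Return a copy of the build in the next environment, or raise if blocked."""
    if not can_promote(build):
        raise ValueError(f"{build.name} does not meet the bar to leave {build.environment}")
    return Build(build.name, next_environment(build.environment),
                 build.pass_rate, build.critical_bugs)
```

Keeping the thresholds in one shared table is one way to give every team the same published answer to "what counts as good" at each stage, and to raise the bar in a single place as launch approaches.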
INSIDE THE APP
Very few tech professionals have the opportunity to launch an app with tens of millions of subscribers ready to log on. Much of the complex work at the foundation of the new Max application and user interface is invisible, from a seamless user migration that retained what a viewer was watching and where they had left off to a horizontally scalable foundation for expansion. What many did notice when logging on to Max for the first time were enhanced parental controls, new content navigation tools and double the amount of content that users had access to in the past. And the expansion is on pace to continue, with growth in LATAM and EMEA markets and new sports content including extensive Olympic coverage ahead.
But the Max team wasn’t working up to the wire to push to production and identify bugs. With a week to go before launch, the production environment was ready and “frozen.”
Tim’s team was paving the way toward a smooth launch.
“Our launch week started months prior to the actual day,” he said. “We made sure that our engineering, service and client application teams would have the tools they needed for a successful launch. If they needed to apply a patch or deploy a fix, we had those available and scalable, and we were ready to handle hundreds of deployments at once.”
With tools like canary analysis to get insights into early feedback and a pre-certification process for the vendors and partners delivering the app to end consumers, alignment between release and product managers made sure the apps were going to ship smoothly — months before launch.
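For readers unfamiliar with canary analysis, the basic idea is to send a small share of traffic to a new version and compare its health against the version already running before rolling out further. The sketch below is a deliberately simple illustration that compares error rates only. The `Metrics` type, the 10 percent tolerance and the minimum sample size are assumptions made for this example, not details of the Max team's setup, and production systems typically compare many metrics with statistical tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Guard against division by zero when no traffic has been served yet.
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(baseline: Metrics, canary: Metrics,
                   max_relative_increase: float = 0.10,
                   min_requests: int = 1000) -> str:
    """Return "pass", "fail" or "inconclusive" for a canary deployment.

    The canary passes if its error rate is at most `max_relative_increase`
    above the baseline's. Both defaults are hypothetical values.
    """
    if canary.requests < min_requests:
        return "inconclusive"  # too little traffic to judge either way
    allowed = baseline.error_rate * (1 + max_relative_increase)
    return "pass" if canary.error_rate <= allowed else "fail"


if __name__ == "__main__":
    baseline = Metrics(requests=100_000, errors=100)
    print(canary_verdict(baseline, Metrics(requests=5_000, errors=5)))
    print(canary_verdict(baseline, Metrics(requests=5_000, errors=10)))
    print(canary_verdict(baseline, Metrics(requests=500, errors=0)))
```

A "fail" verdict would normally trigger an automatic rollback of the canary, while "inconclusive" simply means waiting for more traffic before deciding.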
“When we had a final version ready to ship in the app store, we were confident that it would go out that night,” Tim said. “In those last weeks, we held the line on prioritizing what actually needed to go into production and maintained a very high bar for quality. We relied on coordination, communication and clear expectations as we approached launch day.”
After the freeze for production, the team turned to other priorities.
“We built infrastructure and capacity. We made sure security policies were good. We knew what we needed to tweak in the runbooks in order to flip the switch,” Girish said. “And then before actually flipping that switch, we had about four days of breathing room.”
Those days were a crucial step toward launch, according to Girish. The team had time to process all the work they had done, to rest, refresh and get their final runbooks in place.
“Throughout the quiet period, the environment was humming,” Girish said. “It was good to go.”
And then it was time to open up the launch war room.
Dream It and Own It
When the team gathered in the war room on launch day, they were confident and prepared for every contingency.
While Girish and other senior leaders held down one of WBD’s central tech hubs in Bellevue, Washington, global teams were ready to support in time zones around the world.
“Turning on the services was very smooth and effortless,” Girish said. “We had multiple rehearsals by the launch team and an exhaustive runbook budgeted by time and outcomes to verify that we were checking every line. And when we went live and customers began to experience Max content and product, we went into hypercare with folks on call 24/7 to resolve minor issues, identify any quick tweaks and catalog findings for enhancements as part of the next platform update.”
According to Tom, months of planning the launch day runbook and coordinating clear roles across launch and hypercare began to quickly pay off.
“As the site reliability guy, it brings joy to my heart when nothing is on fire,” he said. “There were no questions as to what each person’s responsibilities were, and we weren’t waiting to hear from customers because our developers had heavily instrumented our application so we would know exactly what issues were popping up.”
In the end, Girish and the team experienced the best possible outcome.
“It was smooth, boring and successful,” he said. “A moment of absolute pride for all of us that months of effort came to life.”
Achieve More Than Imaginable
Even beyond the landmark launch, the scalable, next-generation streaming platform and the nimble processes that made it possible, the talented individuals across the Warner Bros. Discovery team truly found themselves acting as one team.
“We formed lifelong bonds and memories on that journey,” Tom said. “Everybody involved achieved more in less than a year than any of us could have imagined.”
That level of achievement not only fostered a culture of pride and ownership across the team, it also helped people discover strengths they hadn’t expected to find.
“This is a once-in-a-professional-journey opportunity for most of our team,” Girish said. “We found hidden gems in teams, and people flexed muscles they didn’t know they had.”
But even for a seasoned leader like Girish, the Max launch was a landmark occasion, with more opportunities for growth ahead as the team continues to scale the global platform.
“I’ve been responsible for many big launches, and this wasn’t even top three — this absolutely steals the top.”