The Answer to Your Ridiculously Complicated System Architecture? Platformization.
Complexity is a double-edged sword in software engineering. As we scale systems and the organizations that support them, we add moving parts to increase capacity. That complexity makes it more difficult to understand the ramifications of our changes, but avoiding complexity altogether can limit how much the system can handle. I often feel I’m choosing between two problems: Either we have to figure out how to break up a monolithic structure while not disturbing the valuable traffic pulsing through it, or we have to figure out how to facilitate local development of service-oriented structures without asking engineers to essentially download a full copy of production onto their laptops. Occasionally, engineering teams may face both of these problems at the same time, but every engineering team I’ve been on has had to solve at least one of them.
How do we end up building such complex things all the time? And when should we avoid adding complexity to a system?
The solution requires understanding platformization.
What Is Platformization?
In this context, platformization is the process of capturing core functionality in abstractions that minimize the need for software engineers to build support for that functionality themselves. Often this takes the form of tools or frameworks that sit between an existing piece of technology and an application built on top of it. An operating system is full of these types of abstractions, as are container and orchestration systems like Docker or Kubernetes. Deploy tools and shared services are often built as part of a platform.
Platforms help us manage complexity by reducing how much of the complete picture we need to understand. A software engineer does not have to understand the differences between various chip architectures because the platform handles it for her. She doesn’t need to learn multiple dialects of SQL because the ORM builds her queries for her.
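To make the ORM example concrete, here is a minimal sketch of an abstraction that hides one dialect difference: how a backend spells "give me the first N rows." The `Query` class, the `first_users` helper, and the two-entry dialect table are hypothetical names invented for illustration. They are not taken from any real ORM, and a real one handles far more than this.

```python
from dataclasses import dataclass

# Hypothetical dialect table: how each backend spells "first N rows".
# PostgreSQL uses a trailing LIMIT clause; SQL Server uses SELECT TOP.
_LIMIT_TEMPLATES = {
    "postgresql": "SELECT {cols} FROM {table} LIMIT {n}",
    "sqlserver": "SELECT TOP ({n}) {cols} FROM {table}",
}


@dataclass(frozen=True)
class Query:
    """Dialect-neutral description of a query (illustrative only)."""
    table: str
    columns: tuple
    limit: int

    def render(self, dialect: str) -> str:
        # The dialect-specific knowledge lives here, inside the abstraction.
        template = _LIMIT_TEMPLATES[dialect]
        return template.format(
            cols=", ".join(self.columns), table=self.table, n=int(self.limit)
        )


def first_users(n: int) -> Query:
    # Application code: no SQL dialect knowledge required.
    return Query(table="users", columns=("id", "email"), limit=n)


if __name__ == "__main__":
    for dialect in ("postgresql", "sqlserver"):
        print(dialect, "->", first_users(5).render(dialect))
```

The application developer writes `first_users(5)` once. Only whoever maintains `render` needs to know that the two databases disagree about syntax.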
The key is generalization. Platformization is dividing out parts of the system that can be generalized enough to operate as a black box. It’s not that the system is simpler, because more abstractions invariably mean greater complexity. Instead, platformization makes a system seem simpler by limiting how much of the complexity any one engineer must understand at one time. An application developer need not understand the complexity of the container runtime, but the engineers working on containerization tools absolutely do. In modern-day architectures, the complete system can never be fully understood, but by building platforms, we lower the barrier to productive contributions. Engineers onboard faster and push effective code sooner.
When done well, platformization also benefits the health of the systems. When a set of functionality can be generalized to the point where an individual engineer does not need to understand it, that engineer also does not need to change it. For that reason, the incentives that drive platformization are often the benefits of restricting who can make changes to the system.
Abstraction and Generalization
But not all abstractions are generalizations. There are tools and frameworks in our stacks that add variation rather than minimize it. A good way to tell the difference is by asking yourself how well an engineer needs to understand the interface of the dependencies operating under the abstraction in order to effectively use it.
That’s not to say that such frameworks are not useful, but systems get harder to maintain as their complexity rises. So if a particular framework or tool is not minimizing exposure to that complexity, then system architects should think carefully about the value it adds. When people don’t understand how to change something correctly, they quickly find multiple ways to change it incorrectly.
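The test described above (how much of the underlying dependency's interface must a caller know to use the abstraction?) can be illustrated with a small contrast. This is only a sketch. `save_leaky` and `SettingsStore` are hypothetical names, and the choice of JSON files as the underlying dependency is an assumption made purely for the example.

```python
import json
import pathlib
import tempfile


# An abstraction that does NOT generalize: it wraps the dependencies but
# forwards their options, so callers still need to understand open() modes,
# encodings, and json.dump keyword arguments. It adds variation.
def save_leaky(path, obj, mode="w", encoding="utf-8", **json_kwargs):
    with open(path, mode, encoding=encoding) as fh:
        json.dump(obj, fh, **json_kwargs)


# An abstraction that generalizes: one way to save, one way to load.
# File layout and serialization format are hidden behind the interface.
class SettingsStore:
    def __init__(self, directory):
        self._dir = pathlib.Path(directory)

    def _path(self, name: str) -> pathlib.Path:
        return self._dir / f"{name}.json"

    def save(self, name: str, settings: dict) -> None:
        text = json.dumps(settings, sort_keys=True)
        self._path(name).write_text(text, encoding="utf-8")

    def load(self, name: str) -> dict:
        return json.loads(self._path(name).read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = SettingsStore(tmp)
        store.save("app", {"debug": True})
        print(store.load("app"))
```

Both functions are "abstractions" over the same dependencies, but only the second one lets a caller stay ignorant of them. The first gives every caller a slightly different way to write the file.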
Platformization and Scale
The benefit of platformization is carving up the system’s complexity into specific scopes that can be delegated to specialized engineers and safely ignored by everyone else. This makes the system seem simpler, but this simplicity is an illusion. Every new moving part adds a greater level of complexity. For that reason, the platform an engineering team relies on should reflect the scale of the team itself. Attempting to build a platform too soon, before you have the dedicated staff to delegate to, is a common failure.
Fortunately, many of the components of a modern-day platform can be managed by teams external to the engineering organization itself. AWS offers a number of managed services, as do most of its cloud competitors. Using open-source tools with a default, out-of-the-box configuration is another way of delegating complexity to an external team while the internal team scales up.
Engineers are often drawn to the glimmer of Google-scale technology, but economies of scale are as common in IT as they are in manufacturing. Google-scale solutions are overkill at a startup. Trying to build for 10X or 100X the traffic you actually have is a good way to introduce lots of unnecessary complexity.
Knowing when to build out your platform, how much of it to build out, and what abstractions fit a platform model takes practice and experience. As you make strategic decisions about your own architecture, keep these three rules in mind:
- Platforms delegate complexity to specialized teams. Only the team that maintains and iterates on an abstraction needs to understand the dependencies; everyone else interacts with a black box that just works.
- Platforms minimize variance. It should be easy for engineers building on top of the platform to do the right thing and close to impossible for them to push the abstraction into an edge case or an unknown state.
- Not all abstractions generalize. Some abstractions are useful because they add missing functionality, increase performance, or package code so that the product is easier to ship.
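As one possible illustration of the second rule, minimizing variance, here is a sketch of a deploy interface that makes the right thing easy and rejects unknown states up front. Everything in it is hypothetical: the `Environment` enum, the `DeployRequest` fields, and the replica bounds of 1 to 10 are arbitrary assumptions chosen for the example, not a recommendation.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(Enum):
    # The platform team decides which targets exist; callers cannot invent one.
    STAGING = "staging"
    PRODUCTION = "production"


@dataclass(frozen=True)
class DeployRequest:
    """Hypothetical request accepted by an imaginary deploy tool."""
    service: str
    environment: Environment
    replicas: int = 2  # sensible default: the easy path is the right path

    def __post_init__(self):
        # Reject edge cases at the boundary rather than deep in the platform.
        if not isinstance(self.environment, Environment):
            raise TypeError("environment must be an Environment member")
        if not 1 <= self.replicas <= 10:  # arbitrary illustrative bounds
            raise ValueError("replicas must be between 1 and 10")


def plan(request: DeployRequest) -> str:
    # Stand-in for the real work the platform would do.
    env = request.environment.value
    return f"deploy {request.service} x{request.replicas} to {env}"


if __name__ == "__main__":
    print(plan(DeployRequest("checkout", Environment.STAGING)))
```

A request for 500 replicas, or for an environment the platform team never defined, fails when the request is constructed. It never reaches the underlying system.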
Building a platform for the sake of having a platform is not beneficial, but when the organization (a) has reached the scale where the burden of complexity is slowing things down and (b) has the resources to start compartmentalizing and delegating that complexity to specific teams, a platform can be an excellent way to ensure the technology can continue to grow.