Why You Should Be Wary of Software Dependencies
The software world is made up of cascading layers of technologies, each building on the framework of those that came before it to do more, better, faster, easier.
This is how software progresses: Rather than reinventing the wheel each time you begin a new program, you may draw upon the collected work and intelligence of countless programmers who have come before you.
Dependency on these underlying technologies isn’t free though. It comes with a wide variety of costs that must be carefully considered to ensure that the savings in time and expertise are truly worth it.
Let’s explore some of those costs of software dependencies.
What Are Some of the Costs of Software Dependencies?
- Questionable time savings
- Version management
- Requirement of non-native knowledge
- Inability to troubleshoot problems
Questionable Time Savings
The primary motivation for introducing dependencies to a software package is to save time. It doesn’t make sense to write a new programming language from scratch, invent your own cryptography library, or build a web framework from the ground up every time you start a new project. At 4Degrees we use a couple dozen software dependencies to solve these exact kinds of problems.
However, over the past decade or so, there’s been a trend toward using dependencies for virtually everything. “Don’t reinvent the wheel” has been taken to the conclusion “never make a wheel when you can find one.” The problem is that introducing a new dependency to a project isn’t always the time-saver it’s cracked up to be. Researching, selecting, implementing, and tweaking a dependency has a real time cost that in some cases outweighs the time savings all on its own.
For example, in the early days of 4Degrees, we added in a pre-built solution to validate email addresses. We knew email addresses had to follow some set rules and patterns but figured that we should just outsource that knowledge. We ended up stripping out that software package after about a year because it turned out that it wasn’t filtering out all invalid addresses. In the process, we found a “from scratch” method of email validation that required just two lines of code and solved 99 percent of our use cases.
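To give a sense of how small a "from scratch" check can be, here is a minimal sketch in Python. It is not the code we actually shipped; the pattern and the function name looks_like_email are assumptions chosen for illustration. It only checks the rough shape of an address (one "@", no whitespace, a dot somewhere in the domain) and makes no attempt at full standards compliance.

```python
import re

# Hypothetical pattern: something@something.something, with no spaces
# and exactly one "@". Deliberately loose; it checks shape, not deliverability.
_EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(address: str) -> bool:
    """Return True if the whole string has the rough shape local@domain.tld."""
    return _EMAIL_SHAPE.fullmatch(address) is not None
```

A check like this will accept some addresses that can never receive mail, so it suits the "good enough for most inputs" case rather than strict validation.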
Version Management
When adding dependencies to a project, engineers tend to think of the addition as a point-in-time snapshot of capabilities. And for simple packages that may be the case: The code you implement isn’t going to change and you can forget about it. But most dependencies have their own live development and versions that progress over time.
Issues arise when the progression of a dependency’s versions causes substantial changes in its interactions with the rest of the project. It’s not uncommon to have a situation where you have something like version 1.0 installed in your production environment and then all the engineers have versions ranging from 1.2 to 1.5 installed in their local environments. These different versions may result in meaningfully different interactions with the rest of the codebase that can be incredibly tricky to diagnose and troubleshoot.
Of course, there are solutions to this versioning conundrum (like containerization or explicit version specification where possible), but none of them are perfectly simple or foolproof. Versioning challenges can often be a significant source of headaches for teams, particularly where dependencies have proliferated wildly.
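One lightweight way to catch this kind of drift is to compare what is installed in an environment against the versions the project expects. The sketch below uses Python's standard importlib.metadata module. The function name find_version_drift, the pins dictionary, and the injectable installed_version parameter are assumptions made for illustration, not part of any particular tool.

```python
from importlib import metadata


def find_version_drift(pins, installed_version=metadata.version):
    """Return {package: (pinned, installed)} for each package that differs from its pin.

    `pins` maps package names to the exact version the project expects.
    `installed_version` looks up the installed version of a package; it
    defaults to importlib.metadata.version and can be swapped out in tests.
    """
    drift = {}
    for package, pinned in pins.items():
        try:
            installed = installed_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is missing from this environment
        if installed != pinned:
            drift[package] = (pinned, installed)
    return drift
```

Run in each environment (production, CI, a developer's laptop) with the same pins, a check like this would flag the "1.0 in production, 1.5 locally" situation before it shows up as a confusing bug. It does not fix the drift; it only makes it visible.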
Requirement of Non-Native Knowledge
Part of becoming a productive member of a tech team requires learning the team’s practices and standards. These include learning code style guidelines, coding philosophies, and best practices and patterns. This soft knowledge is different across teams, often takes months to fully develop, and has a meaningful impact on an individual’s ability to contribute to the codebase.
One largely unacknowledged cost of dependencies is that they disrupt these team norms and standards. Because dependencies are, by definition, developed externally, they will invariably have their own sets of standards and patterns that don’t align with those used by the team and project. The introduction of these foreign ways of thinking can disrupt the established patterns in the project, and means that any efforts to work with the software package will necessarily be less than fully efficient.
The larger the dependency, the more specialized the knowledge required to work with it. In fact, many technologies are complex enough that entire industries have sprung up around working with them. One example is the wildly popular content management system (CMS), WordPress. While part of the draw of a CMS like WordPress is that it allows non-technical people to build websites, the reality is that it is so complex that it requires in-depth, platform-specific knowledge that can take months or even years to learn.
Inability to Troubleshoot Problems
If you have enough complex dependencies, you inevitably will begin to run into issues and bugs. It’s just the nature of software or any complex system. Unfortunately, issues can be much more difficult to deal with when they arise in dependencies as opposed to when they’re in your native code.
The non-native knowledge required to navigate a foreign codebase is one clear hill to climb. Because the dependency doesn’t follow your own project’s coding standards, it will be harder for your engineers to navigate its logic and find root issues.
In many cases, dependencies are set up in such a way that they’re over-abstracted for any given problem. That’s because these software packages are built for a wide audience’s consumption, as opposed to being tailored to the specific needs of your project. As such, troubleshooting and debugging the more abstract and complex code can be more challenging.
In some cases, dependencies may introduce entire classes of new problems that are unrelated to any of the underlying work that your team is trying to do. One prime example of this at 4Degrees comes from our use of a package called SQLAlchemy. SQLAlchemy is an abstraction layer on top of SQL that manages all of our interactions with our database. The work SQLAlchemy does for us is complex, knowledge-heavy, and invaluable to us — it’s the perfect use case for a dependency.
At the same time, we have a myriad of errors that arise from SQLAlchemy’s connectivity with our database. These aren’t bugs in our code or in the database, but issues that arise from SQLAlchemy’s own code and complexity. SQLAlchemy provides enough benefits that it doesn’t make sense to strip it out completely; we’re stuck with the necessary evil of fighting these bugs as they arise. Of course, the time we spend working on these bugs is time that could otherwise be spent making our product better.
The nice thing about code you’ve written is that it’s your code. You can improve it, rewrite it, adapt it, and grow it whenever you need to. Not so for external dependencies.
While many packages have some configurability and flexibility built in, the reality is that you’re stuck with what you’ve got. The dependency was built to solve a certain set of problems in a certain way. If your own environment doesn’t perfectly match those assumptions then you may be stuck with a suboptimal or even broken solution.
We use an external package to interface with Microsoft’s Outlook services. This package allows us to retrieve mail and calendar information for our users. It’s another one of those complex and knowledge-dependent contexts where a dependency is perfectly suited. And the one we’ve chosen works great ... about 99 percent of the time. Unfortunately, that extra 1 percent of the time we’ll run into some kind of data or environment that our package wasn’t built to handle and the whole system can come to a screeching halt.
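One way to soften that last 1 percent is to isolate each call into the dependency so that a single bad record is logged and set aside instead of stopping everything. The sketch below is a generic pattern, not our production code; process_each_safely and the handler argument are hypothetical names, and handler stands in for whatever function calls the external package.

```python
import logging
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")

logger = logging.getLogger(__name__)


def process_each_safely(
    items: Iterable[T], handler: Callable[[T], R]
) -> Tuple[List[R], List[T]]:
    """Run handler on every item, collecting failures instead of stopping at the first."""
    succeeded: List[R] = []
    failed: List[T] = []
    for item in items:
        try:
            succeeded.append(handler(item))
        except Exception:
            # Log the traceback and keep going; the failed item can be retried
            # or inspected later.
            logger.exception("dependency call failed for %r", item)
            failed.append(item)
    return succeeded, failed
```

Catching every exception this broadly is a tradeoff: it keeps the rest of the batch moving, but it can also hide real bugs, so the failed list and the logs need to be reviewed rather than ignored.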
Of course, it may be possible to modify a dependency with your own custom code to adapt it to your specific needs. If that’s the route you take, you have an uphill battle ahead of you. First, you have to navigate the lack of native knowledge I previously discussed that prevents efficient interaction with a dependency’s codebase.
More importantly, any changes you make to a dependency mean you’re no longer synced with the development and standard versioning of the package. That may be workable in some cases, but it will almost always introduce, at the very least, version management headaches across environments.
Dependencies are a reality of software development. No one starts from machine code to build their projects — nor should they. Software development is so powerful and efficient these days because of all the hard work that past experts have put into building technologies that we can use as building blocks.
But reliance on dependencies can easily turn into over-reliance, with a variety of costs and downsides that negatively impact your project or business. When considering the addition of a new dependency, it’s important to be thoughtful about all of the tradeoffs you’re making. Sometimes it really is best to build that wheel from scratch.