Despite the stereotype of programmers being introverts, modern software development is not a solo endeavor. Software made today depends in large part on the work of other developers, in the form of countless libraries and packages that help with tasks like error logging and access control.
The use of these added libraries — called dependencies — can make software more secure and less error-prone, but it also means there are portions of most codebases that are opaque even to their developers. And since dependencies are themselves software, they are also vulnerable to mistakes and security holes, which are then inherited by software that’s using them.
Maya Kaczorowski, senior director of product management and software supply chain security at GitHub, said that’s why it’s important for projects to stay up to date on their dependencies.
“It’s much easier to apply a patch when it’s really critical if you’re closer to the latest version,” Kaczorowski said. “Small software changes are easier than big software changes. If you’re constantly updating your dependencies, you can handle those changes over time rather than having them all at once.”
In 2019, GitHub acquired Dependabot, a tool that helps developers manage their dependencies. It belongs to a category known as software composition analysis (SCA) tools, which map out projects’ dependencies and check whether any are out of date.
“That’s how all the SCA tools work,” Kaczorowski said. “Basically, what you’re doing is a look-up of the information of your dependencies to a known database of vulnerable dependencies.”
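The look-up Kaczorowski describes can be sketched in a few lines of Python (3.9 or later). This is only an illustration of the general idea, not how Dependabot or any particular SCA tool is implemented: the package names, the advisory IDs and the `ADVISORIES` table are all made up, and versions are assumed to be plain dotted numbers.

```python
from typing import NamedTuple


class Advisory(NamedTuple):
    advisory_id: str            # placeholder identifier, not a real advisory
    package: str                # hypothetical package name
    fixed_in: tuple[int, ...]   # first version that contains the fix


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisory database; a real tool would sync this from a feed.
ADVISORIES = [
    Advisory("EXAMPLE-0001", "logger-lib", parse_version("2.4.1")),
    Advisory("EXAMPLE-0002", "auth-kit", parse_version("1.0.5")),
]


def find_vulnerable(dependencies: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (package, installed_version, advisory_id) for each match."""
    findings = []
    for advisory in ADVISORIES:
        installed = dependencies.get(advisory.package)
        # A dependency is flagged when it is installed and older than the fix.
        if installed is not None and parse_version(installed) < advisory.fixed_in:
            findings.append((advisory.package, installed, advisory.advisory_id))
    return findings


project = {"logger-lib": "2.3.0", "auth-kit": "1.0.5", "left-trim": "0.9.0"}
print(find_vulnerable(project))  # [('logger-lib', '2.3.0', 'EXAMPLE-0001')]
```

Only `logger-lib` is flagged here: `auth-kit` is already at the fixed version, and `left-trim` has no advisory in the table.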
Dependabot is baked into GitHub, which makes tracking dependencies easy for users of the source control platform. The tool sends alerts whenever new updates or security patches appear, and developers can also have Dependabot automatically open pull requests that apply those updates, leaving developers only the work of reviewing and merging them.
Finding Dependencies Isn’t a Trivial Task
Tools exist to help with this process because it can get quite complex. Even the first step of identifying which dependencies a project has is less than straightforward.
“It’s actually really hard to know if you’re affected even with these tools,” Kaczorowski said. “Figuring out, ‘Do I actually use a given vulnerability anywhere?’ is not an easy task for the average company today.”
Kaczorowski said that’s because projects can include so much code, spread across so many places, that vulnerable dependencies become difficult to spot.
“It’s not that it’s technically hard, it’s that your code is everywhere,” she said. “You have a ton of dependencies, and there’s one repo, and there’s another repo, and you have these pre-built packages, the binaries that you’re checking in — knowing if you actually have it in your environment, and also that you actually deployed it, is not a trivial task.”
Dependabot accomplishes this task by looking at projects’ manifest and lock files, the files within a project where developers declare dependencies. Kaczorowski said this works for many projects but is not enough for others, because how complete a manifest file is depends on the development framework and on the diligence of the developers.
“Something like the Gradle ecosystem pulls in a lot of stuff at build time, so a manifest file doesn’t actually tell you a whole lot,” she said. “And if you have developers who are not consistently using manifest files or declaring their dependencies — like copying and pasting a dependency in, for example — then you don’t have that as part of your manifest file.”
A project’s dependencies are also likely to rely on other dependencies, forming sprawling “dependency trees” that make it difficult to compile a comprehensive list.
“The reality is, when you’re pulling in one direct dependency, you’re probably pulling in a bunch of more indirect dependencies,” Kaczorowski said. “With something like [the npm package manager], the average repo will have nearly 700 indirect dependencies as a result.”
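To make the direct-versus-indirect distinction concrete, here is a minimal Python sketch that walks a dependency graph breadth-first and collects everything a project pulls in beyond what it declares itself. The graph and its package names are invented for illustration; a real package manager would build this structure from registry metadata.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on directly.
GRAPH = {
    "my-app": ["web-framework", "logger-lib"],
    "web-framework": ["http-parser", "logger-lib"],
    "logger-lib": ["color-codes"],
    "http-parser": [],
    "color-codes": [],
}


def indirect_dependencies(root: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every package reachable from root that root does not declare itself."""
    direct = set(graph.get(root, []))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        package = queue.popleft()
        for child in graph.get(package, []):
            # Skip packages already visited (and the root, in case of a cycle).
            if child not in seen and child != root:
                seen.add(child)
                queue.append(child)
    # A package that is both direct and indirect is counted as direct only.
    return seen - direct


print(sorted(indirect_dependencies("my-app", GRAPH)))  # ['color-codes', 'http-parser']
```

Even in this toy graph, two declared dependencies bring in two more that the project never named, which is the effect Kaczorowski describes at a much larger scale.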
Vulnerability Databases Help Vet Dependencies
Once an SCA tool has a project’s list of dependencies, it scans for dependencies that are out of date or are behind on the latest security patches. This step relies on checking the project’s dependencies against known vulnerabilities — specifically against the National Vulnerability Database (NVD), a database of reported vulnerabilities maintained by the National Institute of Standards and Technology, within the U.S. Department of Commerce.
“If you find a vulnerability in something, you will publish a Common Vulnerability and Exposure (CVE) — like an ID — for your vulnerability,” Kaczorowski said. “Every software vulnerability everywhere that’s discovered has a CVE and is in the National Vulnerability Database.”
Apart from the NVD, all SCA tools also keep their own vulnerability databases, which pull from the national one but also may pull data from other sources. At GitHub, data from the NVD is combined with security information from npm and vulnerabilities reported by users on GitHub.
“GitHub acquired npm, the package manager, and it has information about the security of those packages,” Kaczorowski said. “Another one is the security advisories that are reported on GitHub. When a maintainer has a project that has a vulnerability in it, they can actually publish that information directly on GitHub as part of the project so that their users can see it.”
Kaczorowski said the additional information improves the quality of the data.
“We’ll curate things that are more relevant, verify it in the environment, that type of thing,” she said. “Data quality, that’s mostly what it is.”
Should Developers Update Dependencies Immediately?
Kaczorowski said there are two approaches to updating dependencies.
“It really depends on what you’re optimizing for,” she said. “One is just to be up to date on the latest ... and the other one is to just say, ‘OK, I’m going to manually review everything that I have.’”
When it comes to security updates, there is always the risk that malicious actors could use a vulnerability disclosure as a blueprint for compromising applications that have not yet been updated. Although that’s a good reason to update to the latest version automatically, updating immediately can sometimes introduce unwanted changes.
“An issue might be that it has a license that your legal department doesn’t support,” Kaczorowski said. “Something like a copyleft license is a common reason why you might not want to use a dependency if you’re a corporation.”
Copyleft licenses turn copyright to the opposite purpose. While copyright ordinarily gives owners exclusive control over the use and distribution of a work, a copyleft license grants the public the freedom to use, modify and redistribute the material, on the condition that works derived from it are released under the same terms. If companies are not careful, they may build products they intended to keep proprietary and monetize, only to find they cannot because the products include copyleft dependencies.
It’s also possible to accidentally introduce new security vulnerabilities through dependencies, whether through mistakes or malicious behavior. GitHub’s State of the Octoverse security report from 2020 found that 17 percent of GitHub’s security advisories were the result of malicious attacks, with the majority originating from npm packages. But the vulnerabilities most developers are likely to encounter were found to be caused by mistakes.
Developers who want to manually review dependency updates before applying them can do so using lock files, which, like manifest files, record all of a project’s dependencies. But lock files also lock in a specific version of each dependency, which prevents packages from automatically being updated to their latest versions.
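The difference between a manifest and a lock file can be illustrated with a small Python sketch. It assumes a manifest that declares npm-style caret ranges such as `^2.3.0` and applies a deliberately simplified rule (same major version, at least the stated minimum, major version 1 or higher); real package managers follow fuller range semantics. The package name, the version list and the `resolve` helper are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies_caret(version: str, minimum: str) -> bool:
    """Simplified caret rule: same major version and not older than minimum."""
    v, m = parse_version(version), parse_version(minimum)
    return v[0] == m[0] and v >= m


def resolve(package: str, manifest: dict[str, str], lock: dict[str, str],
            available: dict[str, list[str]]) -> str:
    """Pick the version to install: the lock file wins, else the newest in range."""
    if package in lock:
        return lock[package]  # pinned: newer releases are ignored
    minimum = manifest[package].lstrip("^")
    candidates = [v for v in available[package] if satisfies_caret(v, minimum)]
    if not candidates:
        raise LookupError(f"no published version of {package} satisfies the manifest")
    return max(candidates, key=parse_version)


# Hypothetical project data.
MANIFEST = {"logger-lib": "^2.3.0"}
LOCK = {"logger-lib": "2.3.0"}
AVAILABLE = {"logger-lib": ["2.3.0", "2.4.1", "3.0.0"]}

print(resolve("logger-lib", MANIFEST, {}, AVAILABLE))    # 2.4.1
print(resolve("logger-lib", MANIFEST, LOCK, AVAILABLE))  # 2.3.0
```

Without a lock entry, the project silently moves to 2.4.1 as soon as it is published; with one, it stays on 2.3.0 until a developer reviews the change and updates the lock file.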
Kaczorowski said it’s important not just to mitigate vulnerabilities once dependencies are added to a project, but also to try to keep vulnerable dependencies from being added in the first place.
“We have a feature that we have in beta right now called the dependency review,” Kaczorowski said. “It will let you know when you’re changing your dependencies, if you’re introducing something that’s vulnerable, or something that has a license that may not be compatible with whatever your policy is, so you can see that.”
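The kind of check Kaczorowski describes can be approximated with a short Python sketch that compares a project’s dependencies before and after a change and warns about anything newly introduced that is known to be vulnerable or carries a disallowed license. It is not GitHub’s implementation; the package names, the vulnerability set, the license assignments and the policy are all assumptions made for the example.

```python
# Hypothetical data a review tool might consult.
KNOWN_VULNERABLE = {("auth-kit", "1.0.4")}
LICENSES = {"auth-kit": "MIT", "chart-draw": "GPL-3.0", "logger-lib": "Apache-2.0"}
DISALLOWED_LICENSES = {"GPL-3.0"}  # example policy: no copyleft dependencies


def review_change(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return warnings for dependencies that were added or changed version."""
    warnings = []
    for package, version in sorted(after.items()):
        if before.get(package) == version:
            continue  # unchanged dependencies are not re-reviewed
        if (package, version) in KNOWN_VULNERABLE:
            warnings.append(f"{package} {version}: known vulnerability")
        license_name = LICENSES.get(package)
        if license_name in DISALLOWED_LICENSES:
            warnings.append(
                f"{package} {version}: license {license_name} not allowed by policy"
            )
    return warnings


before = {"logger-lib": "2.4.1", "auth-kit": "1.0.5"}
after = {"logger-lib": "2.4.1", "auth-kit": "1.0.4", "chart-draw": "0.3.0"}

for warning in review_change(before, after):
    print(warning)
```

In this example the change downgrades `auth-kit` to a vulnerable version and adds a GPL-licensed package, so both are reported, while the untouched `logger-lib` is skipped.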