What Is Fuzz Testing & How Does It Work?

Fuzz testing, or fuzzing, is a software testing technique used to find errors, bugs and vulnerabilities in a computer application. Fuzzing involves the intentional input of invalid or unexpected data (“fuzz”) into a program until it crashes or experiences memory leaks. Developers can then pinpoint what parts of the application’s source code caused these faults and patch them.

Unchecked application vulnerabilities could affect program operations and be exploited by cyber attackers, making fuzz testing a useful technique for cybersecurity purposes.

Fuzz Testing Definition

Fuzz testing, or fuzzing, is a software testing technique that reveals application vulnerabilities by intentionally feeding invalid or unexpected data into a program until it crashes or shows memory leaks. Fuzz testing helps developers find errors, bugs or other security risks that require patching.

How Does Fuzz Testing Work?

Fuzz testing tools, known as fuzzers, work by sending an application random or semi-random inputs that differ from what the application expects to receive. The goal is to crash the application and expose vulnerabilities in input paths that developers wouldn’t otherwise notice. These vulnerabilities mark areas of code that could be at risk for cyber attacks and other threats, and thus should be patched.
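
As a rough illustration of that loop, here is a minimal sketch in Python. The target function parse_age and its bug are hypothetical, invented for this example. The fuzzer treats a ValueError as an ordinary rejection of bad input and counts any other exception as a finding.

```python
import random

def parse_age(text: str) -> int:
    """Hypothetical target: parses an age field and contains a deliberate bug."""
    value = int(text)        # non-numeric text raises ValueError (a normal rejection)
    buckets = [0] * 120
    buckets[value] += 1      # bug: no range check, so out-of-range values raise IndexError
    return value

def random_input(rng: random.Random, max_len: int = 6) -> str:
    """Build a short string from digits plus a few unexpected characters."""
    alphabet = "0123456789-+ x"
    length = rng.randint(0, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))

def fuzz(target, trials: int = 2000, seed: int = 0):
    """Feed random inputs to the target and collect the ones that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = random_input(rng)
        try:
            target(data)
        except ValueError:
            pass                         # expected handling of malformed input
        except Exception as exc:         # anything else is an unhandled failure
            crashes.append((data, type(exc).__name__))
    return crashes

if __name__ == "__main__":
    for data, error in fuzz(parse_age)[:5]:
        print(repr(data), error)
```

Each recorded pair holds the exact input that triggered the failure, which is what a developer needs to reproduce and patch the bug.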

Most vulnerabilities identified by fuzz testing are not actually very difficult to find, though they still can result in significant damage to users and websites if left unattended.

“Not all the technical vulnerabilities are low-level ones,” Ozgur Alp, a security researcher with security testing platform Synack, said. “SQL injections, for example, are really critical vulnerabilities and can be found with automated tests.”

Fuzz Testing Requires Proper Setup

It’s important to understand the parameters of what an application is expecting when generating fuzz testing data.

For those hoping to fuzz test by sending random data to applications with no set strategy, the process can quickly become inefficient. Simply generating random data often won’t work either, because applications usually expect inputs to be formatted a certain way and reject anything that isn’t.

“It’s an infinite space problem,” said Jonathan Knudsen, senior security strategist at Synopsys, a tech company focused on silicon design and software security. “You can make an infinite number of bad inputs for a piece of software. So, to do effective fuzzing in finite time, you have to pick some subset of those to use as test cases.”

An example Knudsen gave was an application that read in data from text files, but the names of the text files all had to have a certain five-digit code in them for the application to treat them as input. If fuzz testing generated completely random files with random file names, none of the files would even be read by the application.

Some of the most effective data to test with is found by looking at the “boundary conditions” of the data, as Jared DeMott, the CEO of security testing company VDA Labs, said in his course on fuzz testing.

“If you’re thinking about an integer, or a string, maybe you’re only supposed to send a small string, but we send a really big one,” DeMott said. “Or maybe you’re only supposed to send a small number for the number of bytes, but you send a null, or negative or a huge [one].”

Boundary data is great for testing because these are likely the conditions that developers of an application didn’t consider and forgot to handle in the code, and are therefore likely to expose errors.
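
A minimal sketch of how such boundary inputs might be generated, assuming a field that expects either an integer in a known range or a string with a known maximum length. The helper names and the particular edge values are illustrative choices, not a standard list.

```python
def boundary_ints(low: int, high: int) -> list:
    """Integers at and just beyond the edges of an expected range [low, high]."""
    candidates = [
        low - 1, low, low + 1,       # around the lower bound
        high - 1, high, high + 1,    # around the upper bound
        0, -1,                       # zero and a negative value
        2**31 - 1, 2**31, -2**31,    # edges of a 32-bit signed integer
        2**63,                       # one past the largest 64-bit signed integer
    ]
    return sorted(set(candidates))

def boundary_strings(max_len: int) -> list:
    """Strings that are empty, at the limit, just over it, and far over it."""
    return [
        "",
        "A" * (max_len - 1),
        "A" * max_len,
        "A" * (max_len + 1),
        "A" * (max_len * 100),
        "\x00",                      # a lone null character
    ]

if __name__ == "__main__":
    print(boundary_ints(1, 100))
    print([len(s) for s in boundary_strings(8)])
```

A fuzzer can mix values like these into otherwise valid inputs, so each test case probes one edge the developer may not have handled.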

When Should Fuzz Testing Be Used?

Before you fuzz test all your applications, it’s worth considering which types of applications benefit most from fuzz testing.

“This type of testing is pretty specific to code called native code,” DeMott said.

DeMott stated that native programming languages, like C and C++, leave memory management up to the developer, which opens up a lot of opportunities for dangerous memory errors. Meanwhile, higher-level programming languages such as Java and C# handle memory for the developer, so those programs won’t suffer from the same problems.

That said, there are plenty of applications that are written in C and C++ that could benefit from fuzz testing. This includes a lot of programs in defense, automotive and aerospace industries, according to DeMott.

Types of Fuzz Testing

MANUAL FUZZ TESTING

The simplest type of fuzz testing is bombarding an application with completely random inputs.

Although manual testing takes more time, it can find certain vulnerabilities that are hidden from automated testing tools.

Nikhil Srivastava, another Synack security researcher, described one such instance, where he found a problem with race conditions on an e-commerce site that allowed users to enter coupon codes. He specifically tried applying the same coupon multiple times to check if the application would accept it and give the corresponding discount.

“This is where you need a manual plus an automated approach to fuzzing, because you’re asking for the fuzzer to focus on this particular field,” Srivastava said. “Rather than running it on the whole application blindly, where your fuzzer does not know the business logic of the application.”

Automated Fuzz Testing and Types

Automated fuzzers help to catch easily findable and fixable bugs in a shorter amount of time than manual testing. These include bugs that are vulnerable to exploits such as cross-site scripting or server-side request forgery, both of which could send user information to an attacker’s site from the compromised website.

Alp noted companies using automated fuzzers would be able to catch a majority of the technical vulnerabilities he reports for his job. “Twenty to 30 percent of my work is on fuzz testing,” Srivastava said. “I have found a thousand bugs now and I guess 200 or 300 is through automated fuzz testing.”

Mutation-Based Fuzzing

Mutation-based fuzzing does not start from entirely random inputs. Instead, it takes existing valid inputs and randomly modifies them to produce new test cases.

“The next step up from random is mutational [or mutation-based] fuzzing,” Knudsen said. “You start with a known good input, and then you just mess it up in certain ways to get the test cases.”

Instead of generating a completely random input file, for instance, the tester would take a legitimate input file that the application accepts, then make edits to generate different tests.

“You might go and shorten it, or insert some extra data in the middle, or flip some bits through it,” Knudsen said. “Because you started from a good input, the test cases look pretty much correct except for the place where they’re messed up, and so they’re more believable. So the target software will do more work on them and you’ll be more likely to find bugs in the target software.”
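
The three edits Knudsen describes (shortening, inserting extra data, flipping bits) can be sketched in a few lines of Python. This is a simplified illustration, not how any particular fuzzer implements mutation, and the sample input is made up.

```python
import random

def mutate(seed_input: bytes, rng: random.Random) -> bytes:
    """Apply one random edit to a known good input: truncate, insert or bit flip."""
    data = bytearray(seed_input)
    choice = rng.choice(["truncate", "insert", "bitflip"])
    if choice == "truncate" and len(data) > 1:
        # Shorten: keep at least one byte and drop the rest.
        del data[rng.randrange(1, len(data)):]
    elif choice == "insert":
        # Insert 1 to 8 random bytes at a random position.
        pos = rng.randrange(len(data) + 1)
        junk = bytes(rng.randrange(256) for _ in range(rng.randint(1, 8)))
        data[pos:pos] = junk
    elif data:
        # Flip a single bit (also the fallback when the input is too short to truncate).
        pos = rng.randrange(len(data))
        data[pos] ^= 1 << rng.randrange(8)
    return bytes(data)

if __name__ == "__main__":
    rng = random.Random(0)
    good_input = b'{"name": "alice", "age": 30}'
    for _ in range(5):
        print(mutate(good_input, rng))
```

Because every test case is one edit away from a valid input, most of it still passes the target's early format checks, which is the effect Knudsen describes.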

Generation-Based Fuzzing

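Another type of fuzz testing, known as generation-based or generational fuzzing, builds test inputs from a description of the expected input format rather than from existing samples. It is better than mutation-based fuzzing at creating data at the boundary conditions of that format.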

The downside is that generational fuzzers rely on developers to create data templates that the fuzzer then uses to generate test inputs. Knudsen’s company, Synopsys, makes a popular commercial generational fuzzing tool called Defensics.
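
To give a sense of what a data template looks like, here is a small sketch for a made-up message format: a 1-byte type, a 2-byte big-endian length, then the payload. The format, the field names and the chosen edge values are all assumptions for illustration and are unrelated to how Defensics or any other commercial tool describes its models.

```python
import struct

# Template: for each field, the values worth trying (hypothetical format).
FIELD_CASES = {
    "msg_type": [0, 1, 255],
    "declared_len": ["correct", 0, 65535],   # "correct" means use the real payload size
    "payload": [b"", b"A", b"A" * 1024],
}

BASELINE = {"msg_type": 1, "declared_len": "correct", "payload": b"hello"}

def build_message(msg_type: int, declared_len, payload: bytes) -> bytes:
    """Serialize one message: 1-byte type, 2-byte big-endian length, payload."""
    length = len(payload) if declared_len == "correct" else declared_len
    return struct.pack(">BH", msg_type, length) + payload

def generate_cases():
    """Yield messages that are valid except for one field set to a template value."""
    for field, values in FIELD_CASES.items():
        for value in values:
            fields = dict(BASELINE, **{field: value})
            yield field, value, build_message(**fields)

if __name__ == "__main__":
    for field, value, message in generate_cases():
        print(field, repr(value)[:20], len(message))
```

Writing the template takes knowledge of the format up front, which is the cost noted above, but it lets the generator produce cases such as a declared length that disagrees with the actual payload.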

Coverage-Guided Fuzzing

Coverage-guided fuzzing, also known as guided fuzzing or evolutionary fuzzing, is a type of fuzz testing which combines aspects of both mutation-based and generation-based fuzzers.

“[Guided fuzzers] combine these ideas of randomly sending data and also knowing something about the spec,” DeMott said. “Knowing something about the spec is kind of hard, because it’s expensive to pre-create all these test cases. You have to pay somebody to do that, and they need to know a lot about the protocol.”

Guided fuzzers use code coverage analysis to calculate how well different test cases perform, and they use mutation-based fuzzing to generate more test cases similar to the high-performing ones.

“If you’re covering more and more of the program, you’re likely to uncover more and more of the bugs,” DeMott said.
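
The feedback loop can be sketched with a toy example. Here the hypothetical target reports which branches it reached by returning a set of labels; real coverage-guided fuzzers obtain this information by instrumenting the program rather than relying on the target to report it. Inputs that reach a new branch are kept and mutated further.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical target: returns the branches reached, fails on the input b'FUZZ'."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("b2")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("b3")
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("simulated crash")
    return covered

def mutate_one_byte(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random uppercase letter."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.choice(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    return bytes(out)

def coverage_guided_fuzz(seed: bytes, trials: int = 50_000, rng_seed: int = 0):
    """Mutate corpus entries, keeping any input that reaches a branch not seen before."""
    rng = random.Random(rng_seed)
    corpus = [seed]
    seen = set(target(seed))
    for _ in range(trials):
        candidate = mutate_one_byte(rng.choice(corpus), rng)
        try:
            covered = target(candidate)
        except RuntimeError:
            return candidate, corpus     # crashing input found
        if not covered <= seen:          # new coverage: keep this input
            seen |= covered
            corpus.append(candidate)
    return None, corpus

if __name__ == "__main__":
    crash, corpus = coverage_guided_fuzz(b"AAAA")
    print("crashing input:", crash)
    print("corpus:", corpus)
```

Without the coverage feedback, a fuzzer changing one byte at a time would have no way to tell that an input starting with "F" is closer to the bug than any other near miss.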

Advanced Fuzzing

There’s also an “advanced” fuzzer that pulls elements of all three types of fuzzers.

“They combine symbolic execution, which is not just having code coverage and guessing, but actually measuring decision points in the software,” DeMott said.

DeMott recommended that developers use the more advanced types of fuzzers, each of which includes the gains from the previous iterations of fuzzers. However, he noted that the most up-to-date commercial tools may be more expensive, and some fuzzers may only be compatible with certain types of operating systems.

Benefits of Fuzz Testing

CHECKS FOR SECURITY VULNERABILITIES

Bug bounty security researchers, who specialize in finding software vulnerabilities in production code, regularly use fuzz testing as part of their investigative toolkit.

Srivastava said fuzz testing can be used to uncover potential avenues for buffer overflow attacks, where attackers are able to write into adjacent computer memory. For example, if a website has a component asking users to upload audio files, attackers can create their own audio files with fuzzed data to upload and see if they will cause the website to crash. When a program crashes, attackers can then adjust the contents of the input file to test whether the website is susceptible to buffer overflow attacks.

Other types of bugs are even easier to find using fuzz testing techniques.

“I have tested an application a few days back where the whole application goes down just by adding a single value — and I caught that through fuzz testing only,” Srivastava said.

He manually confirmed the bug using the same input value — sure enough, it was still causing the page to crash.

CATCHES ERRORS OTHER TESTING METHODS MISS

There are many types of testing available, such as static application security testing (SAST) and dynamic application security testing (DAST). SAST tools examine application code at rest, scanning for known mistakes that can lead to security vulnerabilities, while DAST tools find bugs by running the application. Fuzz testing is similar to DAST because it checks to see how an application responds when it is running and receiving different inputs, but the errors that each method finds are different.

“DAST is looking for known vulnerabilities, things like SQL injection, cross-site scripting ... not something unknown to it,” said David DeSanto, director of product for security at GitLab.

But there are other types of errors that SAST and DAST tools aren’t able to catch. DeSanto gave an example of a code snippet that allocated a set amount of memory for an array, then proceeded to read beyond the space given to the array. This type of error would crash an application and introduce security vulnerabilities, but it’s not included in the type of code errors that SAST and DAST tools look for. However, it’s easy for fuzzers to find that type of bug.

COMPLEMENTS OTHER TESTING TECHNIQUES

Fuzz testing is only one part of the secure development lifecycle, DeMott noted.

Combined with SAST and DAST tools, the unit and integration tests that developers write, and the workflow and DevOps tools built around them, fuzz testing helps find security vulnerabilities that no single tool would be able to catch.

“All of these testing techniques complement each other, they’re all likely to find things that other approaches miss,” DeMott said.

Limitations of Fuzz Testing

A BAD REPUTATION

While fuzz testing can be an effective, low-effort way of testing applications when configured correctly, it still has a reputation in many circles as a mindless strategy that doesn’t yield significant results.

Barton Miller, a computer science professor at the University of Wisconsin–Madison, said his fuzz testing model hasn’t been without its critics.

“In the process of writing our early fuzz papers, we came across strong resistance from the testing and software engineering community,” Miller writes in Fuzzing for Software Security Testing and Quality Assurance. “The lack of a formal model and methodology and undisciplined approach to testing often offended experienced practitioners in the field.”

DIFFICULT TO SET UP

One of the things preventing greater adoption of fuzz testing among developers is that setup still takes some time.

“On one hand, these tools are getting easier to use all the time,” DeMott said. “But on the other hand, it still takes deep expertise and security and coding to really set up, monitor, manage and triage the bugs that come out.”

MUST BE INTEGRATED INTO EXISTING WORKFLOWS

It’s important to integrate fuzz testing into the DevOps pipeline, and to have proper setup so that test failures are logged. If tests simply fail without logging the input data that caused the error, developers won’t know how to fix the code.
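
One way to make failures reproducible is to save every crashing input alongside a short report. The sketch below assumes a Python test harness; the file naming scheme and the report fields are arbitrary choices for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def record_failure(data: bytes, exc: Exception, out_dir: Path) -> Path:
    """Save the crashing input and a JSON report describing the failure."""
    out_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(data).hexdigest()[:16]   # stable name per distinct input
    input_path = out_dir / f"crash-{digest}.bin"
    input_path.write_bytes(data)
    report = {
        "input_file": input_path.name,
        "exception": type(exc).__name__,
        "message": str(exc),
    }
    report_path = out_dir / f"crash-{digest}.json"
    report_path.write_text(json.dumps(report, indent=2))
    return report_path

def run_case(target, data: bytes, out_dir: Path) -> bool:
    """Run one test case; on failure, log the input and return False."""
    try:
        target(data)
        return True
    except Exception as exc:
        record_failure(data, exc, out_dir)
        return False

if __name__ == "__main__":
    def fragile(data: bytes) -> None:      # hypothetical target for the demo
        data.decode("ascii")

    out = Path(tempfile.mkdtemp())
    print(run_case(fragile, b"\xff\xfe", out), sorted(p.name for p in out.iterdir()))
```

A pipeline step can then archive the output directory, so a developer can replay the saved input against the code.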

“People who use fuzzing traditionally had to be an expert in fuzzing to use it,” DeSanto said.

If fuzz testing tools become easier to integrate into developers’ workflows, the practice may become a more widespread and effective part of the testing landscape.

History of Fuzzing

Not all software testing techniques have origin stories, but fuzz testing does: On a stormy evening in 1988, Barton Miller was using a dial-up connection to work remotely on a Unix computer from his apartment. He was attempting to feed input information into a computer program, only to see the program repeatedly crash.

He knew that the electrical noise from the thunderstorm was distorting his inputs into the program as they traveled through the phone line. The distorted inputs were different from what the software needed from the user, resulting in errors. Miller was surprised that even programs he considered robust were crashing as a result of the unexpected input, instead of gracefully handling the error and asking for input again.

Alongside his graduate students at the university, Miller set out to explore the extent of the issue in common computer applications. Their research, conducted over several years, caused program failures across a wide array of Unix, Windows and Macintosh applications by feeding them noisy inputs. He gave their new testing strategy the name “fuzz” testing to “evoke the feeling of random, unstructured data.”

Fuzz testing is now a decades-old software development practice, and today many open-source and commercial tools are available to help developers incorporate it into the software development lifecycle.
