UPDATED BY
Matthew Urwin | May 16, 2023

Mocks and stubs have long been paired with the practice of writing and running tests in software development, especially when developers conduct unit testing. Both mocks and stubs fall under the umbrella category of test doubles. Like stunt doubles, they stand in when developers don’t want to use the real thing during tests.

Test doubles can stand in for pretty much anything in code, but that doesn’t mean it’s wise to do so. After all, when everything in a test is a double, there’s nothing substantial in the actual code that’s being tested. Some people aren’t comfortable using test doubles in their tests, arguing that doubles cause problems like making tests brittle and wasting people’s efforts on testing doubles rather than the real system.

But test doubles serve important purposes and different types of doubles set up tests in distinct ways. Developers who know how to use them properly can more easily test for the right things that fit their circumstances.

Below, we cover in more detail what mock testing is, what stub testing is and how the two differ. We also share what four engineers told us about their experiences using test doubles.

 

What Is Mock Testing?

Mock testing refers to the process of replacing a piece of code, typically a dependency of the code under test, with mock code in order to isolate and test the real code’s behaviors. A mock is a stand-in piece of code that imitates the behaviors of the real code and is ideal for verifying interactions, such as the order in which functions are called. Developers can easily create mocks through common libraries like jMock and Mockito.

The main benefit of mock testing is that it enables developers to test code without having to worry about other variables like traffic flow and network issues. Mock code doesn’t perfectly reproduce the behavior of the real code, but it does give developers an accurate view of how code behaves when unaffected by outside factors.    
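As a rough illustration of checking call order with a mock, here is a minimal sketch. It uses Python's standard-library unittest.mock rather than the Java libraries named above, and the sync_profile function and the client's fetch and save methods are hypothetical names invented for this example.

```python
from unittest.mock import Mock, call


def sync_profile(client, user_id):
    """Code under test (hypothetical): fetch a profile, then save it back."""
    profile = client.fetch(user_id)
    client.save(user_id, profile)
    return profile


# The mock replaces a network client, so no real request is made.
client = Mock()
client.fetch.return_value = {"name": "Ada"}

result = sync_profile(client, 42)

# Verify which methods were called, with what arguments, and in what order.
assert client.mock_calls == [
    call.fetch(42),
    call.save(42, {"name": "Ada"}),
]
```

Because the client is a mock, the test runs the same way regardless of network conditions. If sync_profile saved before fetching, the final assertion would fail.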

 

What Is a Stub?

A stub is also a stand-in piece of code, but it imitates only a narrow slice of the behavior of a service or object. That’s because a stub is programmed to respond to specific inputs and ignore anything else. A stub then produces the same value during each test, so software developers can single out an individual behavior of their code to study. Stubs are also easy to create with tools like JustMock, requiring little code.

Because of their limited capabilities, stubs are often applied to simpler testing procedures. For example, a stub may be used as a stand-in module to test an application. Developers can determine what problems might arise when introducing the actual module by analyzing the behavior of the stub. They can then make adjustments based on their stub testing and better equip the application for the real module. 
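A stub can be as small as a hand-written class. The sketch below is written in Python and does not use any of the tools named above. The StubExchangeRates class and the convert function are hypothetical names made up for illustration.

```python
class StubExchangeRates:
    """Stub: answers one specific input with a fixed value, nothing more."""

    def rate(self, currency):
        # Canned answers; any other currency gets None.
        return {"EUR": 0.5}.get(currency)


def convert(usd_amount, currency, rates):
    """Code under test (hypothetical): convert a USD amount."""
    return usd_amount * rates.rate(currency)


# The stub returns the same rate on every run, so the result is predictable.
assert convert(10, "EUR", StubExchangeRates()) == 5.0
```

The stub says nothing about how often or in what order it was called. It only supplies a fixed answer so that the behavior of convert can be studied on its own.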

 

Mock vs. Stub: Key Differences Between the Two 

Mocks and stubs are both test doubles that streamline the testing process, but there are distinct features that separate the two. 

Mocks are much more advanced than stubs. While stubs are programmed to produce the same result based on a set of specific inputs, mocks can be programmed to know how many times and in what order functions should be called during testing. Stubs can’t track these details, making mocks ideal for larger, more complex tests. 

Stubs are still better equipped to test certain aspects of software, such as the behavior of a specific piece of code. However, testing one aspect, or unit, of software often involves monitoring other units, which is where mocks excel. As a result, mocks and stubs can be seen as complementary to one another in unit testing. They enable developers to zoom in on certain aspects of software and then understand how different aspects interact with each other.

 

Should Developers Be Choosing Between Using Stubs or Using Mocks?

Steven Solomon, lead software engineer at Stride: One of the elements I want to bring to this conversation is that stubs or mocks are not the only options for test doubles. There’s also fakes, spies and dummies, which all have different design qualities that you can play with.

Stubs are really good when you’re setting up preconditions that you want to be true in a testing scenario, like if I have a checkout cart logic that I’m trying to test. A stub would be something like making sure that a cart repository returns a specific set of products while mocks are more about verifying that some action has been taken. Using that same scenario, after someone completes the checkout logic, making sure that an email goes out to the customer showing the total of their cart and the products in that cart — that would be something you would mock.
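Solomon's checkout scenario could look roughly like the following sketch in Python's unittest.mock. The checkout function, the products_for and send_receipt methods, and the sample products are all assumptions made for this example, not code from the interview.

```python
from unittest.mock import Mock


def checkout(cart_id, cart_repository, mailer):
    """Code under test (hypothetical): total the cart and email a receipt."""
    products = cart_repository.products_for(cart_id)
    total = sum(price for _, price in products)
    mailer.send_receipt(cart_id, products, total)
    return total


# Used as a stub: sets up the precondition (what is in the cart).
cart_repository = Mock()
cart_repository.products_for.return_value = [("book", 12), ("pen", 3)]

# Used as a mock: checked afterwards to verify that an action was taken.
mailer = Mock()

total = checkout("cart-1", cart_repository, mailer)

mailer.send_receipt.assert_called_once_with(
    "cart-1", [("book", 12), ("pen", 3)], 15
)
```

Both doubles are created with the same Mock class here. What makes one a stub and the other a mock is how the test uses it: the repository only supplies data, while the mailer is the subject of the verification.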

Brian Riley, principal software engineer at Carbon Five: Stubs and mocks are different things. There are some libraries that don’t make the distinction, but traditionally they are different things. Stubs are objects where, if someone calls this method on you, this is what you’re going to respond with — and that’s it. It’s a stand-in for some other object. 

So an email service is actually a great example of this. The email service might make a call out to a third-party service and you don’t want that to get hit during tests. So what you would do is you would create a stub that says, “When the Send method gets called on this email service, I respond with: ‘OK, it was sent.’”

There’s another thing called a spy, which will record the calls that are made to it, along with all of the arguments that get passed in. So in the case of the email service, you set up your email service spy, then somebody could call the Send method and then you do your test. Afterwards, you can ask, “How many times was the Send method called on this? And what were the arguments that were sent to it?”
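The email service stub and spy Riley describes can be sketched by hand in a few lines of Python. The EmailServiceSpy class, its send method and the welcome function are hypothetical names chosen for this example.

```python
class EmailServiceSpy:
    """Hand-written spy: gives a canned reply and records every call."""

    def __init__(self):
        self.calls = []

    def send(self, to, subject):
        self.calls.append((to, subject))  # remember the arguments
        return "OK, it was sent"          # stub-style canned response


def welcome(email_service, address):
    """Code under test (hypothetical)."""
    return email_service.send(address, "Welcome")


spy = EmailServiceSpy()
reply = welcome(spy, "ada@example.com")

# After the fact: how many times was send called, and with what?
assert len(spy.calls) == 1
assert spy.calls[0] == ("ada@example.com", "Welcome")
```

Removing the calls list would leave a plain stub. The recording is what turns it into a spy.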

 

What Do Test Doubles Have to Do With State or Behavior Verification?

Solomon: It’s about asserting state versus asserting behavior and those are two different ways to look at the same thing, both of which have their merits. There are two schools of thought: the “classicist” style and the “mockist” style, the latter commonly referred to as London style because it emerged out of London. The mockist style would use mocks or stubs. The classicist style would be more about asserting whether a specific state has occurred. So you wouldn’t actually test that an email service is called. Instead, you would go look in a Kafka queue and see that there’s some message there that will eventually get picked up by a worker that says, “This cart was purchased by this customer.”
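A state-based check in the classicist style might look like the sketch below. It does not talk to Kafka. The InMemoryQueue class is a hypothetical in-memory fake standing in for a real message queue, and the complete_purchase function and message fields are likewise assumptions.

```python
from collections import deque


class InMemoryQueue:
    """Fake queue (hypothetical): a working but simplified stand-in."""

    def __init__(self):
        self.messages = deque()

    def publish(self, message):
        self.messages.append(message)


def complete_purchase(queue, order_id, customer_id):
    """Code under test (hypothetical): announce a completed purchase."""
    queue.publish(
        {"event": "purchase_completed", "order": order_id, "customer": customer_id}
    )


queue = InMemoryQueue()
complete_purchase(queue, "order-1", "customer-9")

# State verification: inspect what ended up in the queue,
# not which methods were called along the way.
assert list(queue.messages) == [
    {"event": "purchase_completed", "order": "order-1", "customer": "customer-9"}
]
```

The test does not care how the message got there, only that it is present once the purchase completes.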

Conrad Benham, principal consultant at Stride: Steven hit the nail on the head there regarding the separation between classicist versus mockist. So if I’m test-driving a service, the service may need some other service to interact with. So I would test-drive the logic of service A and then mock service B. And then I would set up expectations on the mock of service B to state, “OK, I’m expecting a call with these arguments,” and then verify that it occurred. But the email object that was passed in doesn’t get mocked. It’s got a state, it’s got its representation.

Riley: The distinction is often described as state verification and behavior verification. So with the stub, you do some stuff to it and then you ask, “What’s your current state? How many times were you called? What is your state?”

With a mock, you’re trying to test more of the behavior side of things. So you’re saying, “These are the things I expect to happen to the subject.” It’s established before you actually run the test, you’re going to set up all these expectations. And then when you run the tests, it’ll figure it out.

With the [classicist] school, more often than not, they’re going to reach for stubs and spies, not mocks — and they’re only doing that when things are just a little bit too awkward to set up.

 

Mock vs. Stub: When to Use Each One 

Benham: I have a very strong rule of not mocking any domain objects or any model objects — that is a very, very strict rule that I follow. The moment you start doing that, that’s when you get into trouble. The mocks should be services, so things that have data flowing through them on the stack, but nothing that lives in the heap gets mocked. Contravening that will land you in a world of pain.

Let’s say you want to send an email. You may have an email object that has a sender and recipients, CC recipients and BCC recipients, in addition to subject, the body and you may also have attachments — that would be the main object because it’s capturing the essence of what’s to be sent. And then the services that support the sending of that would be things such as a messaging queue. But the domain object there is the email object. And then your supporting services, which I would mock, would be the service that takes that email object and then figures out what to do with it. So I wouldn’t mock that email object.
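Benham's rule could be sketched like this in Python: the email is built as a real object, while the service that delivers it is mocked. The Email dataclass fields follow his description, but the send_welcome function and the deliver method are hypothetical names.

```python
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass(frozen=True)
class Email:
    """Domain object: real in tests, never mocked."""
    sender: str
    recipients: tuple
    subject: str
    body: str
    cc: tuple = ()
    bcc: tuple = ()


def send_welcome(delivery_service, address):
    """Code under test (hypothetical): build an email and hand it off."""
    email = Email(
        sender="noreply@example.com",
        recipients=(address,),
        subject="Welcome",
        body="Hello!",
    )
    delivery_service.deliver(email)
    return email


# Only the supporting service is a mock.
delivery_service = Mock()
sent = send_welcome(delivery_service, "ada@example.com")

# Behavior check on the service, state check on the real domain object.
delivery_service.deliver.assert_called_once_with(sent)
assert sent.recipients == ("ada@example.com",)
```

Because Email is a real object with real state, the test can compare it by value instead of scripting its getters on a mock.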

Riley: If you’re going to go the mock route, then beforehand, when you’re setting up your fake email service, you are specifying: “These are the methods that are going to get called, with these arguments, in this order.” And it will complain if that does not happen.

I have to set up all these tests against the internals of the implementation — with the mock style, you’re digging into the implementation a lot. If you’re doing the mock style, it’s very easy to get in a situation where you’re defining, “OK, warehouse gets called first and then email next.” And that might be kind of annoying because you’re really just trying to test the waters — or maybe not! Both sides are kind of right. Maybe you do want to test that these things get called in this order and that’s exactly where you would use the mock style.
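The "warehouse first, then email" check Riley mentions can be approximated in Python's unittest.mock by attaching both mocks to a shared parent, which records calls across them in order. The fulfil_order function and the reserve and send_confirmation methods are hypothetical. Unlike the expectation-first libraries he describes, this sketch verifies the order after the code has run.

```python
from unittest.mock import Mock, call


def fulfil_order(warehouse, email, order_id):
    """Code under test (hypothetical): reserve stock, then confirm by email."""
    warehouse.reserve(order_id)
    email.send_confirmation(order_id)


warehouse = Mock()
email = Mock()

# A shared parent records calls to both collaborators in one ordered list.
parent = Mock()
parent.attach_mock(warehouse, "warehouse")
parent.attach_mock(email, "email")

fulfil_order(warehouse, email, "order-7")

# Fails if the methods, the arguments or the order differ.
assert parent.mock_calls == [
    call.warehouse.reserve("order-7"),
    call.email.send_confirmation("order-7"),
]
```

This is the trade-off described above: the test pins down the order of internal calls, which is useful when the order matters and a burden when it does not.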

 

Can Using Mocks or Stubs Lead to Fragile Tests?

Madelyn Freed, full-stack engineer at Stride: I don’t like to use mocks within my domain — I only really mock things on the boundaries of my domain, if I’m calling out to somebody else. Because once you introduce mocks, you basically are making your tests have to know the names of the things that they’re dependent on. And I just don’t think that’s taking advantage of the full benefit of tests, which is making sure everything works, and allowing you the flexibility to redesign something quickly with a lot of confidence that it really actually works, because you have every code path covered. But you don’t have proper names embedded in your tests that you then have to change.

Riley: Take the case of a banking app where the project managers come along and say, “OK, now we want to see the last 30 days worth of transactions.” 

You probably already have a test section, so you’d expand it and assert that you’re seeing those specific transactions within the 30-day range — that’s your high-level, feature-level test. So you’ve got that, and then you would go into the unit testing world — now you’re testing individual objects.

This is where things like code design come into play because it’s really easy to get into a situation where the class you’re unit testing depends on 17 other classes, so now you’re not just exercising this one class, you’re actually exercising all of them. That’s where maybe a bug in one of the collaborator objects causes your test to fail.

 

What Potential Problems Should You Watch Out for When Using Stubs and Mocks?

Freed: When I’m starting something new, my mocking approach is more “inside-outside.” What I am writing and designing is “inside,” and I don’t want to mock anything there, and everything else — third parties, other people’s systems — I want to keep some sort of boundary. When I’m unit testing, trying to enforce small units as soon as possible can sometimes lock in an interface that you actually aren’t attached to — so I prefer starting at a more integration level.

Does the whole thing work from top to bottom? If so, and that interface remains working, and every code path is covered on an integration level, then just go wild with the rest of your design — not unit testing, or mocking any other interfaces you might make, even if you’re splitting things off into new classes because you’re still experimenting with how the system will go and you don’t want to lock in something that’s not your main interface. 

That’s how I approach it: keep an integration test that fully covers every code path. And that’s the only thing I make sure keeps passing for a long time. And I won’t introduce anything new, I’m just basically refactoring until I’ve kind of solidified the other classes that I’ve made and hopefully arrived at the design that I think best models things.

Riley: The problem that you can run into with mocks is if you do mock out one of those collaborators and then the collaborator changes, you could get a false positive in your test. The test is going to pass because it’s using a mock that is behaving in a specific way. From the test perspective, everything works fine, but the underlying code that the mock represents might have changed and the test isn’t catching that. So that’s one of the things you’ve got to be careful with.

There is a certain discipline that you have to take with it. People can easily get themselves into situations where they’re mocking everything out and false positives can happen. And then they get really frustrated and they’re like, “Mocks are terrible.” But I think it more points to a lack of discipline with how you’re using them.

You have to ask yourself, how am I going to know if the code underneath changes? So you do want to make sure you’ve got other tests there around that kind of stuff. That’s where the higher-level tests can help because it’s making sure that everything is wired up correctly together.
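One partial safeguard, alongside the higher-level tests Riley recommends, is to build mocks from the real class so they fail when its interface drifts. The sketch below uses create_autospec from Python's unittest.mock. The PaymentGateway class, its renamed method and the pay function are hypothetical.

```python
from unittest.mock import Mock, create_autospec


class PaymentGateway:
    """Real collaborator after a refactor: charge() was renamed to capture()."""

    def capture(self, amount):
        raise NotImplementedError("would talk to a real payment provider")


def pay(gateway, amount):
    # Stale caller (hypothetical): still uses the old method name.
    return gateway.charge(amount)


# A plain Mock accepts any attribute, so the stale call passes silently.
loose = Mock()
pay(loose, 10)
loose.charge.assert_called_once_with(10)

# An autospecced mock mirrors the real class and rejects unknown methods.
strict = create_autospec(PaymentGateway, instance=True)
try:
    pay(strict, 10)
    drift_detected = False
except AttributeError:
    drift_detected = True

assert drift_detected
```

A spec only catches changes to names and signatures. A collaborator that keeps its interface but changes what it returns would still slip past, which is why the higher-level tests remain necessary.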
