Mocks and stubs have long been paired with the practice of writing and running tests in software development, especially when developers conduct unit testing. Both mocks and stubs fall under the umbrella category of test doubles — like stunt doubles, they’re a stand-in for when developers don’t want to use the real thing during tests.
Mocks vs. Stubs: What Are the Differences?
- Both mocks and stubs are test doubles, which are code constructs used during software testing to stand in for actual objects and services.
- Mocks verify the behavior of the code you’re testing, also known as the system under test. Mocks should be used when you want to test the order in which functions are called.
- Stubs return canned responses so that a test can verify the state of the system under test. Stubs don’t take order into account, which can be helpful for reducing the work of rewriting tests when code is refactored.
Test doubles can stand in for pretty much anything in code, but that doesn’t mean it’s wise to do so. After all, when everything in a test is a double, there’s nothing substantial in the actual code that’s being tested. Some people aren’t comfortable using test doubles in their tests, arguing that doubles cause problems like making tests brittle and wasting people’s efforts on testing doubles rather than the real system.
But test doubles serve important purposes and different types of doubles set up tests in distinct ways. Developers who know how to use them properly can more easily test for the right things that fit their circumstances.
Four developers talked with Built In about their experiences using test doubles. Steven Solomon is a lead software engineer and engineering manager at Stride Consulting, Brian Riley is a principal software engineer at Carbon Five, Conrad Benham is a principal consultant at Stride and Madelyn Freed is a full-stack engineer at Stride.
Should Developers Be Choosing Between Using Stubs or Using Mocks?
Steven Solomon: One of the elements I want to bring to this conversation is that stubs and mocks are not the only options for test doubles. There are also fakes, spies and dummies, which all have different design qualities that you can play with.
Stubs are really good when you’re setting up preconditions that you want to be true in a testing scenario, like if I have a checkout cart logic that I’m trying to test. A stub would be something like making sure that a cart repository returns a specific set of products while mocks are more about verifying that some action has been taken. Using that same scenario, after someone completes the checkout logic, making sure that an email goes out to the customer showing the total of their cart and the products in that cart — that would be something you would mock.
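The checkout scenario above can be sketched roughly as follows, using Python's standard unittest.mock library. The checkout function, the get_products and send_receipt methods and the sample data are all hypothetical names invented for illustration, not part of any real system described in the interview.

```python
from unittest.mock import Mock


def checkout(cart_repository, email_service, customer_email):
    """Hypothetical checkout logic: total the cart, then email a receipt."""
    products = cart_repository.get_products()
    total = sum(price for _name, price in products)
    email_service.send_receipt(customer_email, products, total)
    return total


# Stub role: set up a precondition by returning a fixed set of products.
cart_repository = Mock()
cart_repository.get_products.return_value = [("book", 12), ("pen", 3)]

# Mock role: stands in for the email service so the action can be verified.
email_service = Mock()

total = checkout(cart_repository, email_service, "pat@example.com")

# State check on the result, made possible by the stubbed cart contents.
assert total == 15

# Behavior check: exactly one receipt went out, with the cart and its total.
email_service.send_receipt.assert_called_once_with(
    "pat@example.com", [("book", 12), ("pen", 3)], 15
)
```

Both doubles are built from the same Mock class here. What makes one a stub and the other a mock is how the test uses it: the cart repository only supplies data, while the email service is the thing the test asserts against.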
Brian Riley: Stubs and mocks are different things. There are some libraries that don’t make the distinction, but traditionally they are different things. Stubs are objects where, if someone calls this method on you, this is what you’re going to respond with — and that’s it. It’s a stand-in for some other object.
So an email service is actually a great example of this. The email service might make a call out to a third-party service and you don’t want that to get hit during tests. So what you would do is you would create a stub that says, “When the Send method gets called on this email service, I respond with: ‘OK, it was sent.’”
There’s another thing called a spy, which will record the calls that are made to it, along with all of the arguments that get passed in. So in the case of the email service, you set up your email service spy, then somebody could call the Send method and then you do your test. Afterwards, you can ask, “How many times was the Send method called on this? And what were the arguments that were sent to it?”
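A minimal hand-written sketch of the stub and the spy described above might look like this. The class names, the send signature and the notify function are assumptions made for the example.

```python
class StubEmailService:
    """Stub: answers with a canned response and never contacts a third party."""

    def send(self, to, subject, body):
        return "OK, it was sent"


class SpyEmailService:
    """Spy: records every call and its arguments so the test can inspect them."""

    def __init__(self):
        self.calls = []

    def send(self, to, subject, body):
        self.calls.append((to, subject, body))
        return "OK, it was sent"


def notify(email_service, to):
    """Hypothetical code under test."""
    return email_service.send(to, "Welcome", "Thanks for signing up.")


# With the stub, the test only cares about what comes back.
assert notify(StubEmailService(), "pat@example.com") == "OK, it was sent"

# With the spy, the test asks afterwards how often send was called and with what.
spy = SpyEmailService()
notify(spy, "pat@example.com")
assert len(spy.calls) == 1
assert spy.calls[0] == ("pat@example.com", "Welcome", "Thanks for signing up.")
```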
What Do Test Doubles Have to Do With State or Behavior Verification?
Steven Solomon: It’s about asserting state versus asserting behavior, and those are two different ways to look at the same thing, both of which have their merits. There are two schools of thought: the “classicist” style and the “mockist” style, which is commonly referred to as London style because it emerged out of London. The mockist style would use mocks or stubs. The classicist style would be more about asserting whether a specific state has occurred. So you wouldn’t actually test that an email service is called. Instead, you would go look in a Kafka queue and see that there’s some message there that will eventually get picked up by a worker that says, “This card was purchased by this customer.”
Conrad Benham: Steven hit the nail on the head there regarding the separation between classicist versus mockist. So if I’m test-driving a service, the service may need some other service to interact with. So I would test-drive the logic of service A and then mock service B. And then I would set up expectations on the mock of service B to state, “OK, I’m expecting a call with these arguments,” and then verify that it occurred. But the email object that was passed in doesn’t get mocked. It’s got a state, it’s got its representation.
Brian Riley: The distinction is often described as state verification and behavior verification. So with the stub, you do some stuff to it and then you ask, “What’s your current state? How many times were you called? What is your state?”
With a mock, you’re trying to test more of the behavior side of things. So you’re saying, “These are the things I expect to happen to the subject.” It’s established before you actually run the test, you’re going to set up all these expectations. And then when you run the tests, it’ll figure it out.
With the [classicist] school, more often than not, they’re going to reach for stubs and spies, not mocks — and they’re only doing that when things are just a little bit too awkward to set up.
Are There Situations Where You Should Probably Use Mocks Over Stubs or Stubs Over Mocks?
Conrad Benham: I have a very strong rule of not mocking any domain objects or any model objects — that is a very, very strict rule that I follow. The moment you start doing that, that’s when you get into trouble. The mocks should be services, so things that have data flowing through them on the stack, but nothing that lives in the heap gets mocked. Contravening that will land you in a world of pain.
Let’s say you want to send an email. You may have an email object that has a sender and recipients, CC recipients and BCC recipients, in addition to a subject, a body and possibly attachments. That would be the main object because it’s capturing the essence of what’s to be sent. And then the services that support the sending of that would be things such as a messaging queue. But the domain object there is the email object. And then your supporting services, which I would mock, would be the service that takes that email object and then figures out what to do with it. So I wouldn’t mock that email object.
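One way to picture this rule is the sketch below, where the email is a real domain object and only the supporting service is mocked. The Email fields follow the description above, while WelcomeService, deliver and the addresses are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from unittest.mock import Mock


@dataclass
class Email:
    """Domain object: plain data with its own state, never mocked."""
    sender: str
    recipients: list
    subject: str
    body: str
    cc: list = field(default_factory=list)
    bcc: list = field(default_factory=list)
    attachments: list = field(default_factory=list)


class WelcomeService:
    """Hypothetical service being test-driven."""

    def __init__(self, delivery_service):
        self.delivery_service = delivery_service

    def welcome(self, address):
        email = Email("hello@example.com", [address], "Welcome", "Glad you joined.")
        self.delivery_service.deliver(email)
        return email


# The supporting service is mocked; it is what data flows through.
delivery_service = Mock()
WelcomeService(delivery_service).welcome("pat@example.com")

# The expectation compares against a real Email, relying on dataclass equality.
expected = Email("hello@example.com", ["pat@example.com"], "Welcome", "Glad you joined.")
delivery_service.deliver.assert_called_once_with(expected)
```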
Brian Riley: If you’re going to go the mock route, then beforehand, when you’re setting up your fake email service, you are specifying: “These are the methods that are going to get called, with these arguments, in this order.” And it will complain if that does not happen.
I have to set up all these tests against the internals of the implementation — with the mock style, you’re digging into the implementation a lot. If you’re doing the mock style, it’s very easy to get in a situation where you’re defining, “OK, warehouse gets called first and then email next.” And that might be kind of annoying because you’re really just trying to test the waters — or maybe not! Both sides are kind of right. Maybe you do want to test that these things get called in this order and that’s exactly where you would use the mock style.
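For cases where call order really is the thing under test, a sketch with unittest.mock could look like the following. Note that this library verifies calls after the code runs rather than declaring them beforehand, so it is only an approximation of the expectations-first mock style described above. The place_order function and the reserve and send methods are hypothetical.

```python
from unittest.mock import Mock, call


def place_order(warehouse, email, item, address):
    """Hypothetical code under test: reserve stock first, then notify."""
    warehouse.reserve(item)
    email.send(address, f"Your {item} is on its way")


# Child mocks of one parent share a single, ordered call log.
manager = Mock()
place_order(manager.warehouse, manager.email, "book", "pat@example.com")

# Fails if the calls, their arguments or their order differ.
assert manager.mock_calls == [
    call.warehouse.reserve("book"),
    call.email.send("pat@example.com", "Your book is on its way"),
]
```

A test like this breaks as soon as the two calls are swapped during a refactor. That is the desired outcome when order matters and the source of the annoyance when it does not.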
Can Using Mocks or Stubs Lead to Fragile Tests?
Madelyn Freed: I don’t like to use mocks within my domain — I only really mock things on the boundaries of my domain, if I’m calling out to somebody else. Because once you introduce mocks, you basically are making your tests have to know the names of the things that they’re dependent on. And I just don’t think that’s taking advantage of the full benefit of tests, which is making sure everything works, and allowing you the flexibility to redesign something quickly with a lot of confidence that it really actually works, because you have every code path covered. But you don’t have proper names embedded in your tests that you then have to change.
Brian Riley: Take the case of a banking app where the project managers come along and say, “OK, now we want to see the last 30 days’ worth of transactions.”
You probably already have a test section, so you’d expand it and assert that you’re seeing those specific transactions within the 30-day range — that’s your high-level, feature-level test. So you’ve got that, and then you would go into the unit testing world — now you’re testing individual objects.
This is where things like code design come into play because it’s really easy to get into a situation where the class you’re unit testing depends on 17 other classes, so now you’re not just exercising this one class, you’re actually exercising all of them. That’s where maybe a bug in one of the collaborator objects causes your test to fail.
What Potential Problems Should You Watch Out for When Using Stubs and Mocks?
Madelyn Freed: When I’m starting something new, my mocking approach is more “inside-outside.” What I am writing and designing is “inside,” and I don’t want to mock anything there, and everything else — third parties, other people’s systems — I want to keep some sort of boundary. When I’m unit testing, trying to enforce small units as soon as possible can sometimes lock in an interface that you actually aren’t attached to — so I prefer starting at a more integration level.
Does the whole thing work from top to bottom? If so, and that interface remains working, and every code path is covered on an integration level, then just go wild with the rest of your design — not unit testing, or mocking any other interfaces you might make, even if you’re splitting things off into new classes because you’re still experimenting with how the system will go and you don’t want to lock in something that’s not your main interface.
That’s how I approach it: keep an integration test that fully covers every code path. And that’s the only thing I make sure keeps passing for a long time. And I won’t introduce anything new, I’m just basically refactoring until I’ve kind of solidified the other classes that I’ve made and hopefully ended up with the design that I think best models things.
Brian Riley: The problem that you can run into with mocks is if you do mock out one of those collaborators and then the collaborator changes, you could get a false positive in your test. The test is going to pass because it’s using a mock that is behaving in a specific way. From the test perspective, everything works fine, but the underlying code that the mock represents might have changed and the test isn’t catching that. So that’s one of the things you’ve got to be careful with.
There is a certain discipline that you have to take with it. People can easily get themselves into situations where they’re mocking everything out and false positives can happen. And then they get really frustrated and they’re like, “Mocks are terrible.” But I think it more points to a lack of discipline with how you’re using them.
You have to ask yourself, how am I going to know if the code underneath changes? So you do want to make sure you’ve got other tests there around that kind of stuff. That’s where the higher-level tests can help because it’s making sure that everything is wired up correctly together.
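The false positive described above, and one possible guard against it, can be sketched with unittest.mock. The EmailService class, its renamed method and the notify function are hypothetical. The guard shown, create_autospec, is one option the Python standard library offers and is not something the interviewees mention.

```python
from unittest.mock import Mock, create_autospec


class EmailService:
    """Hypothetical collaborator whose send() method was renamed to deliver()."""

    def deliver(self, to, body):
        raise NotImplementedError("would call a third-party service")


def notify(email_service, to):
    # Code under test still calls the old method name.
    email_service.send(to, "hello")


# A plain Mock accepts any method name, so this test passes despite the rename.
loose = Mock()
notify(loose, "pat@example.com")
loose.send.assert_called_once_with("pat@example.com", "hello")

# A mock built from the real class only exposes what that class defines,
# so the stale call surfaces as an AttributeError.
strict = create_autospec(EmailService, instance=True)
try:
    notify(strict, "pat@example.com")
    drift_detected = False
except AttributeError:
    drift_detected = True

assert drift_detected
```

A spec'd mock only catches changes to names and signatures. Changes in what the collaborator actually does still need the higher-level tests.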
Responses have been edited for length and clarity.