A Matter of Interpretation: Mock Objects

In the middle of a discussion about test doubles/mock objects and their role in design, someone posted this:


This is a companion discussion topic for the original entry at https://blog.thecodewhisperer.com/permalink/a-matter-of-interpretation-with-mock-objects

Imagine that you were going to buy a car and you had a choice between two brands:

  • Brand A has tested all of the car’s individual components in isolation: the tires, the gears inside the engine, the steering wheel, and so on. All of the small components have dedicated unit tests, and all dependencies are mocked inside each test. Brand A has no tests where all the components are put together as a whole.

  • Brand B has black-box tests like “The car can start”, “The car can stop”, “The car can turn to the left”, “The car can turn to the right”, and so on. All of the tests are performed on an actual car.

Which car brand would you choose?

@jbrains, I believe the person cited meant mocks specifically, in the sense that you explicitly define their behavior in the test setup (e.g. what Mockito offers in the Java world), contrasting them with other test doubles.
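To make that concrete, here’s a minimal sketch of what I mean by explicitly defining behavior in the test setup, using Mockito and JUnit 5 (the Customer, CustomerRepository, and Greeter names are invented for this illustration, not taken from the article):

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.when;

    import org.junit.jupiter.api.Test;

    // Hypothetical domain types, invented for this illustration.
    interface CustomerRepository {
        Customer findById(String id);
    }

    record Customer(String id, String name) {}

    class Greeter {
        private final CustomerRepository repository;

        Greeter(CustomerRepository repository) {
            this.repository = repository;
        }

        String greet(String id) {
            return "Hello, " + repository.findById(id).name() + "!";
        }
    }

    class GreeterTest {
        @Test
        void greetsTheCustomerByName() {
            // The mock's behavior is scripted explicitly, call by call,
            // in the test setup.
            CustomerRepository repository = mock(CustomerRepository.class);
            when(repository.findById("42")).thenReturn(new Customer("42", "Alice"));

            assertEquals("Hello, Alice!", new Greeter(repository).greet("42"));
        }
    }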

I share that sentiment. For example, I prefer to use a hash-map-backed implementation of a repository over creating a mock and setting up expectations.
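The hash-map-backed version is just an ordinary class implementing the same (still hypothetical) CustomerRepository interface, so a test saves data into it instead of scripting expectations:

    import java.util.HashMap;
    import java.util.Map;

    // A hash-map-backed fake: a real, if simplified, implementation.
    // No expectation setup; the test saves data in, and the code under
    // test reads it back out.
    class InMemoryCustomerRepository implements CustomerRepository {
        private final Map<String, Customer> customersById = new HashMap<>();

        public void save(Customer customer) {
            customersById.put(customer.id(), customer);
        }

        @Override
        public Customer findById(String id) {
            return customersById.get(id);
        }
    }

A test would then call repository.save(new Customer("42", "Alice")) rather than choreographing when(...).thenReturn(...).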

Ironically (?), I was challenged a few times on using a stub implementation like the one described above instead of mocking, and my answer was what you explain in the article: you wouldn’t write list = mock(List.class) where you could just use list = new ArrayList() in your test.

Hi, @Anders. Fortunately, we never have to make that choice, so my answer would be meaningless anyway. Let me propose a choice closer to what actually tends to happen.

  • Brand A has spent 1000 hours testing the entire car on the road in a variety of driving conditions. They also run all their individual components through a full battery of tests every week, and every month they add a few more tests to their standard test suite for the more-complicated components.

  • Brand B has spent 1000 hours testing the entire car on the road in a variety of driving conditions. They’ve also spent about 5000 hours testing the entire car only on their test track, under predictable-but-realistic driving conditions that rarely change.

Which car brand would you choose?

Keep in mind that these are magic cars that can be reliably upgraded while they sit in your driveway and you sleep. Brand A can ship fixes overnight about 85% of the time with little risk of creating new problems, and can show you which tests they ran that convinced the entire design team that 96% of the components of the car were unaffected by the changes. Brand B needs at least 50 hours of driving tests to give the design team just enough confidence to release improvements to the car, usually prefers at least 200 hours, and although they feel confident that 96% of the components of the car were unaffected by the changes, you just have to trust them, because they have trouble explaining where their confidence comes from.

(This analogy has many problems, including the fact that the cost of failure for getting this wrong on a car is so high that we would shy away from Brand A, but most of the software systems we build carry nowhere near the cost of failure of a failing car. Human reasoning generally performs extremely poorly under conditions of significant fear of personal harm, so those are not the conditions under which I’d trust the average programmer’s reasoning about which tests to write and how much confidence to have in the correctness of their code.)

This article nowhere concludes “therefore never use integrated tests”, as your comment implied. It merely concludes that “because it’s more natural/feels better” (1) doesn’t represent a strong argument to rely on integrated tests and (2) isn’t even how programmers actually feel about the situation in general.