Integrated Tests Are A Scam

I understand your experience, I've seen dozens (hundreds?) of other people experience it, but I interpret it differently than you (and many of them) do. In particular, I would add something crucial to your last sentence: "Unit tests (sic) protect your code from change _if you don't pay attention to what they're trying to tell you_."

I interpret the problem you describe as test doubles doing their job: alerting me to accidental dependencies in my code. Many programmers react to this situation with "the test doubles are killing me", but I disagree: your design is killing you, and the test doubles cause you to feel the pain. If you try to live with the pain long enough, then eventually it becomes debilitating, and you end up spending hours "fixing" tests but not fixing problems. What you have is indirection without (enough/suitable) abstraction. More-suitable abstraction helps me create useful, _stable_, hard boundaries between components. These boundaries become so stable that I _want_ to hear some alarm blaring somewhere when someone tries to change one of them, and test doubles act as that alarm.

Sometimes we improve the design by applying the DIP to remove a level of indirection: we replace a Supplier with the data it supplies. This removes an abstraction that might have been helpful in the past, but is no longer. I provide a simple example that illustrates this case in the article: https://blog.thecodewhisper...
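Here is a minimal sketch of the shape of that refactoring; the names (Price, Catalog, Sale) are invented for illustration and don't come from the article:

```java
// A Value with just enough behavior for the example.
record Price(long cents) {
    Price times(int quantity) { return new Price(cents * quantity); }
}

// Before: Sale pulls the price through a Supplier-style collaborator,
// so every test of subtotalFor() needs a stub for Catalog.
interface Catalog {
    Price findPrice(String barcode);
}

class Sale {
    private final Catalog catalog;

    Sale(Catalog catalog) { this.catalog = catalog; }

    Price subtotalFor(String barcode, int quantity) {
        return catalog.findPrice(barcode).times(quantity);
    }
}

// After: the caller passes the data that the Supplier used to supply.
// The abstraction disappears from this function, and so does the stub.
class SaleAfter {
    Price subtotalFor(Price unitPrice, int quantity) {
        return unitPrice.times(quantity);
    }
}
```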

Sometimes we improve the design by recognizing that excessive mocking (the kind that you probably have in your head) typically highlights duplication that, if removed, leaves behind an abstraction that marks a now-stable boundary between components. I don't have an article with a simple example of this, but it will become part of my next online training course, when I finally get around to producing it.

Yes, integrated (not integration!) tests can protect your code, but at a cost that rises super-exponentially (O(n!)) over time. This is the great thing about junk drawers: they're really convenient until suddenly you can't find the one damn thing you're looking for. When that happens, create a new drawer dedicated to that kind of thing and put that thing back in it. That's what I'm recommending here.

Why "integrated" over "integration" in this context? https://blog.thecodewhisper...

As for "you broke it", I don't find that quite as universally true as you seem to. When an integrated tests fails, it often points to something that lies entirely outside my control. Did I break it? Maybe. Maybe not. Maybe someone else broke something. Should I depend directly on that thing that someone else controls? Maybe not. If I apply DIP there, then I create indirection so that I can check my side while making reasonable assumptions about their side; and if that test becomes unwieldy, then I introduce a more-suitable abstraction. Matteo Vaccari wrote a nice example of this so that I didn't have to: https://medium.com/@xpmatte...

When sit-ups hurt, let's not assume that the sit-ups are the problem, because more often the problem lies in the fat around our stomachs or our weak abdominal muscles. Test doubles are the sit-ups and the hardwired dependencies on concrete things are the fat/weak muscles.

I agree with Wesley, this article is misguiding. Mainly with Agile project cycles and short releases, and with most companies moving away from a monolithic ESB gateway to more of a microservices architecture, where services are deployed into several component-level groups and contracts are tested individually to make sure component-level and unit-level tests work.
Apart from this, when those individual components are built together, we need at least some level of integration tests, either with mock or without mocked data, to test the components together. So service-to-service integration tests are very important in business.

I find this comment very interesting! You say that my article "is misguiding", but: (1) you say that companies using microservices architecture are testing components individually and checking contracts between components; and (2) you say that we need "integration tests with mock or without mocked data" to test the components together.

So where is the misguiding part?

In point 1, you say that companies are doing exactly the thing that my article encourages them to do: to check components individually and to check carefully the contracts between components. I have noticed, too, that especially since Martin Fowler's article on Consumer-Driven Contract Testing, more companies are taking that advice seriously. They use tools like Pact to document and check the published behaviors of distributed services one by one, rather than all (or many) at once. I'm saying the same thing, except that I also do it within services: I think of it as "microservices in a single process".

In point 2, you say that we need "integration tests with mock". Yes! An "integration test with mocks" is exactly the same thing as the collaboration tests that I'm talking about here: an integration test checks the points of integration using something like mock objects without running the whole system together, whereas an integrated test connects the production implementations together. This confusion is why I stopped saying "integration tests are a scam" in 2010. See https://blog.thecodewhisper...
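As an illustration of checking a point of integration without connecting the production implementations, here is a sketch, again assuming JUnit 5 and Mockito, with names (OrderGateway, Checkout) that I'm inventing:

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.junit.jupiter.api.Test;

// The point of integration between my component and its neighbor.
interface OrderGateway {
    void submit(String orderId);
}

class Checkout {
    private final OrderGateway orders;

    Checkout(OrderGateway orders) { this.orders = orders; }

    void complete(String orderId) {
        orders.submit(orderId);
    }
}

class CheckoutTest {
    @Test
    void submitsTheOrderAcrossTheBoundary() {
        // "Integration test with mocks": check the message that crosses the
        // boundary, without wiring up the production OrderGateway.
        OrderGateway orders = mock(OrderGateway.class);

        new Checkout(orders).complete("order-42");

        verify(orders).submit("order-42");
    }
}
```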

I suspect that you and I agree quite a lot.

Mm, yeah, I am more clear now. I think you need to change the title "integration tests are a scam". In my company, at each ops review we have an action item that we need to increase integration-level tests, so maybe you could reword it a bit. Thanks for the explanation. Even unit tests don't test permutations; they only test the logic. TDD is a good way to test the code before even writing it fully. But in my experience, writing 100% TDD tests is good advice, yet with 2-week sprints it is really challenging to achieve 100 percent integration testing.

I changed the title in 2010, because the phrase "integration tests" was not accurate. I changed it to "integrated tests" to clarify my position as I have explained it to you.

As you say, it is challenging to achieve 100% integrated testing. (I assume that you mean this.) I agree, so I don't try to achieve it. Instead, I verify that each layer can talk to the next layer correctly using special kinds of unit tests. I can do this at every layer from the beginning of the system to the end. I have been writing about these techniques and teaching them for 15 years. It is difficult to explain in only one paragraph. :)
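One common way to write those "special kinds of unit tests" is an abstract contract test that every implementation of a boundary must pass. This sketch assumes JUnit 5, and the names are invented for illustration:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

interface ExchangeRates {
    double rateFor(String currency);
}

// Contract test: every implementation of the boundary must pass this,
// which justifies the assumptions that collaboration tests express as stubs.
abstract class ExchangeRatesContract {
    protected abstract ExchangeRates createRatesWithEurAt(double rate);

    @Test
    void reportsTheRateItWasGiven() {
        assertEquals(1.10, createRatesWithEurAt(1.10).rateFor("EUR"), 0.001);
    }
}

// The lightweight implementation used in most tests passes the contract...
class InMemoryExchangeRatesTest extends ExchangeRatesContract {
    @Override
    protected ExchangeRates createRatesWithEurAt(double rate) {
        return currency -> rate;
    }
}

// ...and the production implementation would extend the same contract class,
// typically as a slower test that talks to the real rate provider.
```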

I got to collaborator tests (test roles) from pursuing a great software architecture.
I had always ignored frameworks like Spock and the like, but now I am blown away by how many things *click* with this when your architecture is well designed.

I can understand why these tests can tell you if your code is not well constructed, and I find it a little disappointing that advocates of collaborator tests aren't talking more about object-oriented architecture.

I invite you to hear from Jakub Nabrdalik; he shares his implementation of:
Hexagonal Architecture https://www.youtube.com/wat...
Behavior Driven Development https://www.youtube.com/wat...

Those two concepts should be discussed together more often (at least in OO languages).

Thank you for these references!

Yes, in my training I teach how architecture improves when we focus on collaboration and contract tests. We can start from either side: from strong architecture towards nice tests, or from nice tests towards strong architecture. I like to teach it from nice tests towards strong architecture, so that "good design" seems less mysterious and easier to achieve.

The book _Growing Object-Oriented Software, Guided by Tests_ provides the first long-form discussion of this connection. It influenced my thinking a lot.

The article talks specifically about the case where someone has written integration tests to cover all the functional logic.

Unit tests would not cover infra, configurations, or deployment issues, nor would they validate our assumptions about external systems (the assumptions we make while writing the stubs).

How can we trust our CI/CD pipelines without validating our assumptions about API contracts? We will definitely need some “system test” or “end-to-end test”.

I think this article gives a more pragmatic look at why we need integration tests: https://martinfowler.com/bl... and how contract tests might be the solution you are looking for if your data comes from an external system: https://martinfowler.com/bl...

Yes. This article acted as the manifesto, and I never intended for it to describe the entire picture. I wrote it purely to draw attention to the problem as I saw it. In the years since, I have described how I limit integrated tests to the boundaries of the system. I do this more aggressively than most people.

Even so, we don't need end-to-end tests _of our entire system_ to validate assumptions about dependencies on external systems. I avoid this by limiting integrated tests to the boundary and relentlessly removing duplication at that boundary. This results in a minimum of integrated tests. That itself isn't always my goal, but I'm glad that I know exactly how to do it when I need it.

Other articles in this series reach conclusions similar to the ones that Martin reaches.

@jbrains I understand the exponential nature of branches to the power of the layers. What I don't understand is how this test methodology reduces the number of tests you have to write. The way you describe it in the video doesn't seem to decrease the number of tests more than marginally when I draw out a tree of branching pathways.

The main benefit of mocking/stubbing in tests, as I see it, is that you're testing close to the component, rather than *needing* to consider all the implications of all the pathways underneath that component; i.e., you're more likely to think of all the potential use cases when the testing is narrowly focused on just that component.

In essence, two things happen:

1. products turn into sums
2. an object in the tree never knows about objects lower in the tree than its (direct) children (because interfaces hide those details)

If you already do (2) quite well, then it matters less. If trees of collaborating objects have a height of no more than 3, then the "combinatoric explosion of tests" argument loses its teeth, and instead the arguments for locality of change take over.
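To put made-up numbers on "products turn into sums": if a component has 4 interesting behaviors and each of its two collaborators also has 4, then integrated tests need on the order of 4 × 4 × 4 = 64 paths to cover the combinations, while 4 collaboration tests for the component plus 4 contract tests per collaborator come to roughly 4 + 4 + 4 = 12. With trees of height 2 or 3 the difference is modest, which matches your experience; the gap only becomes dramatic as depth and branching grow.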

Interesting, I didn't do very many layers in my trees.

Have you seen any of Vladimir Khorikov's stuff? His views seem to be diametrically opposed to yours. He strongly believes in leaving real objects in the dependency tree, with the exception of volatile dependencies (databases, email, other systems, etc.). He also thinks mocks should almost never, if ever, be used. In reality, he creates his own mocks for reuse purposes, but if reuse isn't necessary, I find that a simple mocking library like Mockito is pretty succinct.

P.S.
I did note that you wrote an article (I don't recall which one) indicating that the need for mocks can be an indicator of badly designed code, which I too have found. But that doesn't mean you never use them, writing classes to handle the mocking cases instead.

I haven't seen Vladimir's work, but based on your description, I don't think we disagree at all. What you/he label "volatile" dependencies sound like Service implementation dependencies ("Service" in the DDD sense) as opposed to Values (again in the DDD sense). In essence, I freely put interfaces (and therefore test doubles) in front of Services and rarely in front of Values. We might differ in intent or framing, but otherwise it sounds pretty similar.
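A tiny sketch of that distinction, with invented names:

```java
// A Service (in the DDD sense): it does work, often with side-effects,
// so I put an interface in front of it and use a test double in tests.
interface MailRoom {
    void send(String recipient, String body);
}

// A Value (in the DDD sense): plain data with behavior and no side-effects,
// so I use the real thing everywhere; no interface, no test double.
record Money(long cents, String currency) {
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("mismatched currencies");
        }
        return new Money(cents + other.cents, currency);
    }
}
```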

You might be referring to "Beyond Mock Objects" where I show the refactoring from a function that talks to a supplier of a value to a function that simply accepts the value as a parameter, which eliminates a stub (by eliminating a possibly unnecessary level of indirection). I tend not to think of test doubles as a sign of a design risk--it depends more on the presence of irrelevant details in a test, and needing to stub or mock something can be irrelevant just as the implementation details of a collaborator can be irrelevant.

I think what @trentondadams is alluding to is that Vladimir Khorikov and Eric Elliott, just to name a few, adhere to the Classic school, and you and Mark Seemann to the London school. But hey, don’t quote me on that; I don’t mean to put words in other people’s mouths. I just try to learn from everything you guys write about.

It’s an age-old debate, with many nuances, such as your focus on contract tests. The Classic school thinks it is a good thing not to mock a pure call to another function, because you’re creating a safety net the more you call it.

Both schools favor the test pyramid, so not having unit tests is not what is at stake here. Unit tests drive quality code. It’s how to test the composition of units that is more of a hot topic.

First of all: broadly yes, with a few precise bits to add.

Indeed, I don’t adhere to the London school, although I definitely studied there extensively. :) Moreover, I think there’s a misconception about the London school: it suggests using test doubles freely to check interactions with side-effects. This is different from “test doubles are good, so use more of them”, which many people seem to think it is claiming.

As I interpret things, the London school tolerates side-effects and suggests testing them with test doubles instead of by duplicating tests at ever-bigger scales of the system. It doesn’t demand inverting dependencies on side-effects in the way that the FP style seems to do. It doesn’t give in to integrated tests in situations where we decide to leave the side-effect where it is. The Classic/Chicago/Detroit school seems to tolerate duplication in tests at different scales of the system while eliminating test doubles, whereas the London school tolerates test doubles while eliminating duplication in tests at different scales of the system.

As you say, both schools prefer microtests. They merely favor/teach/recommend different coping strategies.

My current position is this: “You Don’t Hate Mocks; You Hate Side-Effects” (The Code Whisperer).

Broadly speaking, I don’t mind injecting an abstraction to insulate the Client against a problematic side-effect, although when I do this, I often eventually invert the dependency, moving the side-effect up and (more) out of the way. This has the result of gradually reducing the number of test doubles in my tests, even though I never have the explicit goal of eliminating (all) test doubles.

Even so, I recognize problems with certain tests containing test doubles, such as too many expectations or stubbing a function to return another stub to return a desired value. I interpret those as design risks and tend to refactor to eliminate them. However, when a test stubs 1 query and expects 1 action, I don’t rush to refactor this to a Logic Sandwich. I feel comfortable doing that when it becomes helpful in support of some other way of improving the design.
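For readers who haven’t met the term, here is my own minimal sketch of the “stub 1 query, expect 1 action” shape next to the Logic Sandwich it could become; the names are invented:

```java
import java.util.Optional;

interface Appointments {
    boolean hasAppointmentToday(String who);
}

interface MailRoom {
    void send(String recipient, String body);
}

// Before: one stubbed query, one expected action, one decision in between.
// The test stubs Appointments and expects a call on MailRoom.
class Reminder {
    private final Appointments appointments;
    private final MailRoom mailRoom;

    Reminder(Appointments appointments, MailRoom mailRoom) {
        this.appointments = appointments;
        this.mailRoom = mailRoom;
    }

    void remind(String who) {
        if (appointments.hasAppointmentToday(who)) {
            mailRoom.send(who, "See you today!");
        }
    }
}

// After (Logic Sandwich): the caller performs the query and the action;
// the decision in the middle becomes a pure function, testable with no doubles.
class ReminderPolicy {
    static Optional<String> messageFor(boolean hasAppointmentToday) {
        return hasAppointmentToday ? Optional.of("See you today!") : Optional.empty();
    }
}
```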
