Your Constructors are Completely Irrational - The Code Whisperer - Articles

March 2015

michael_donaghy

Nope. A constructor needs to put an object in a valid state, which it should really only have one of - if your object might be in one of several different states, you should probably be using polymorphism and instead have something that might be one of several different objects. Constructor code has much stronger guarantees than arbitrary code, which make it much easier to reason about - a constructor is guaranteed to be called exactly once, and is compile-time guaranteed to either initialize all final fields or throw an exception. In a well-factored Java codebase, most of the code should be in constructors.

Of course that's no excuse for not cleanly separating concerns - but most of the time an overly couples constructor reflects an overly coupled object, and the correct solution is not to move the code into a static (which means you have much weaker guarantees about object state and need much more testing), but to split the object into smaller objects, each with their own responsibilities handled by their own constructor.

"Factory" is a perfectly good way to describe a factory method. Words have meanings and the word "factory" is an ordinary English word with an ordinary English meaning. Attempting to impose a different meaning on the word is doomed to failure; if you need to talk about something more specific, come up with a word that means that.

1 reply

March 2015 ▶ michael_donaghy

jbrains

"In a well-factored Java codebase, most of the code should be in constructors." This stands out as strange to me, rather in the extreme. By chance, can you point to an example? It seems to me that constructors doing most of the work in a system leads to an excess of hardwired dependencies that makes for a terrifying case of ripple effect everywhere all the time. I'd hate to try to test any part of that system on its own, except perhaps the very bottom layer. I'm happy to see a counterexample, because that would really change my worldview.

"A constructor needs to put an object in a valid state, which it should really only have one of - if your object might be in one of several different states, you should probably be using polymorphism and instead have something that might be one of several different objects." Yes. I don't understand why this sentence followed "Nope", seeing as we agree on this and I made the first part of this point directly and, I thought, clearly in the article. Do you think we disagree on this point? If so, why?

I don't understand how performing instantiation in a class-level function or a method affects the amount of testing to do. Can you explain this point? For example, I can see how we'd eventually move Game.newGame() onto some kind of GameConfiguration or GameEngine, renaming newGame() to start() (and then we could add pause(), resume(), save(), load()...). Let's say I did that now: I move Game.newGame() to a new method and invoke it as new GameEngine(...).startGame(...). How does this save me any effort in testing? How does this change the coupling with Game's constructor? How does this make anything any more coupled to any part of Game? I don't see what you mean here at all. It looks perfectly isomorphic to me.

Regarding "Factory", yeah, yeah... no matter what word we choose, you can find one group of people who believe that clear communication matters and we should use words consistently, so they don't like that "Abstract Factory" and "Factory Method" use the same word to mean different things -- and then you can find the other group of people whose Harley gets 40 rods to the hogshead and that's the way they likes it. I understand that with regard to this word, you're used to the extra cognitive load and don't mind. I prefer assertEquals(expected, actual) to assertThat(actual, is(equalTo(expected))) and some people consider me a dinosaur for that. That's fine. That's just a point of personal preference.

1 reply

March 2015

Snagglepusbucket

The trouble I have is that when everything has to be completely abstract, and the expert-recommended architecture is to constantly abstract, abstract, and abstract, delegate, delegate, and delegate, down, down, and down into ever-deeper layers of maybe slightly less abstract code, how, and at what point, do you decide where to draw the line and finally bottom out into something concrete? Seems to me it could threaten to become infinitely deep, and even rise that don't usually go too far.

Most code I've seen that the "experts" would consider acceptably abstracted is nearly impossible to comprehend. The abstractions often bear no obvious relation to the ostensible purpose of the code, at least not until you've read-and-understood more layers than I, for one, can keep in my head while waiting to reach bottom and discover what actually happens. And then, by the time I do reach the concrete code, often the higher-level operations have been broken down into such tiny pieces that few, if any, of the concrete methods/functions contain *enough* code to provide sufficient context to tell what *they* do, either.

So in my opinion, extreme OO/abstraction can play hell with readability, this understanding, and thus using, these code architectures that experts praise. What good is beautiful code, if the would-be user's reaction is "screw it, I'll write my own?" Your priorities seem misplaced.

1 reply

March 2015 ▶ jbrains

michael_donaghy

> "In a well-factored Java codebase, most of the code should be in constructors." This stands out as strange to me, rather in the extreme. By chance, can you point to an example? It seems to me that constructors doing most of the work in a system leads to an excess of hardwired dependencies that makes for a terrifying case of ripple effect everywhere all the time. I'd hate to try to test any part of that system on its own, except perhaps the very bottom layer. I'm happy to see a counterexample, because that would really change my worldview.

The clearest example I've found of this is in Wicket UIs, where you tend to end up with a parallel hierarchy of components that correspond to datatypes. E.g. you might have a UserEditorPanel that contains a NameEditorField and an AddressEditorPanel and the latter includes various fields. So you do new UserEditorPanel(user), and that constructor in turn calls new AddressEditorPanel(user.getAddress()) and so on. Everything does ripple up and down, but in a structured way that corresponds directly to the domain model.

You're right that this does push one away from doing conventional unit testing where one would mock UserEditorPanel (although this is largely just an artefact of Java rather than a fundamental thing, and you can use e.g. PowerMock if you really need to). But I'm coming around to a position that those kind of "middle" unit tests aren't actually as valuable as the more integrated tests that you get from my approach. Think about it in bottom-up terms: if AddressEditorPanel passes all its tests then we're happy it behaves as it's meant to. So then if UserEditorPanel fails a test, even if that test ran against the real (non-mock) AddressEditorPanel, we know the problem is in UserEditorPanel not in AddressEditorPanel. And we're less likely to get the kind of subtle bugs that happen when your mock AddressEditorPanel behaves slightly differently from the real one.

> "A constructor needs to put an object in a valid state, which it should really only have one of - if your object might be in one of several different states, you should probably be using polymorphism and instead have something that might be one of several different objects." Yes. I don't understand why this sentence followed "Nope", seeing as we agree on this and I made the first part of this point directly and, I thought, clearly in the article. Do you think we disagree on this point? If so, why?

I think we disagree because of your article's example. As originally written, the class has an invariant that the list always contains 50 Pop Questions, etc. (at least assuming the class doesn't allow outsiders to modify the lists, which it shouldn't - it would be better if it used guava's `ImmutableList` and declared the list as `final`). It's part of the game's valid state that it has those, and it's impossible to ever construct a `Game` that doesn't have those 50 questions in. By moving the adding the questions into a static factory method, you give the object more possible states, some of them invalid, and you will need to e.g. test whether the `Game` returned from some higher-level function has those 50 questions or not (a non-static factory method would at least allow you to mock/verify the correct call without needing PowerMock, though that's fairly marginal).

On re-reading the article, maybe the `Game` example was supposed to have multiple constructors? In which case I think that's the real problem, and the right solution is to split the class into `RealGame` and `TestGame` which implement the same interface, and `TestGame`'s constructor should add those test questions and `RealGame`'s constructor should not.

> Regarding "Factory", yeah, yeah... no matter what word we choose, you can find one group of people who believe that clear communication matters and we should use words consistently, so they don't like that "Abstract Factory" and "Factory Method" use the same word to mean different things -- and then you can find the other group of people whose Harley gets 40 rods to the hogshead and that's the way they likes it. I understand that with regard to this word, you're used to the extra cognitive load and don't mind.

It seems the other way around to me. I care about consistency of the meaning of words. I want the meaning of the word "Factory" to be consistent with the ordinary, English meaning of the word "Factory"; I don't want it to mean something different when we're programming from what it means in "real life". (I get equally annoyed when someone tries to insist that e.g. "story" has to mean something specific to a particular workflow: no it doesn't, it's the ordinary English word "story", if you wanted something more specific you should have coined a more specific word)

If you want to just call it personal preference that's fine, but from your "please stop calling things factories" it sounded like you were trying to get other people to change what they said.

1 reply

April 2015 ▶ Snagglepusbucket

jbrains

I understand how you feel. I'm not suggesting that everything become perfectly abstract, but I am trying to describe the benefits of making more things more abstract, because 99.9% of the designs I see out there suffer from being overly-fixated on details (too concrete), leading to duplication, copy/paste mistakes, latent bugs, unhealthy interdependencies... all diseases of too many details. Like it or not, abstraction is how we hide details and how we go from low-level understanding to high-level understanding.

Abstractions without appropriate names cause problems, so we need to take names seriously, which is why I have a system for doing just that. http://link.jbrains.ca/ovMMvz and http://link.jbrains.ca/UXOYK6

Indirection without abstraction, also knows as leaky abstraction, causes cohesion problems that directly relate to your well-founded concerns. We need to learn that abstraction is not just indirection. Details move up the call stack and patterns of behavior move down. That's why I wrote about this problem in http://link.jbrains.ca/1cIWL4u and less obviously in http://link.jbrains.ca/119E8jy

When abstractions leak, we end up with implicit dependencies: we think we've hidden details, but we haven't. I see this especially in poorly-designed attempts at introducing class hierarchies, especially especially in little mini-frameworks. That's why I wrote about my pain in trying to add code to Octopress 2. http://link.jbrains.ca/GZYYhA

I hope some of this helps.

April 2015 ▶ michael_donaghy

jbrains

I haven't done that with Wicket, but I remember going through this with Swing. I always found that I preferred moving that to an explicit "layout()" method in my custom Panels and Buttons and so on. This way I could separate layout concerns from wiring up event listeners and check those separately.

Regarding the example, you're suggesting changes that, in the space of this article, I couldn't possibly have got to yet. :) Moreover, you're postulating an invariant that isn't an invariant at all, but a mere implementation detail. Not only is "50 questions per category" not an invariant, but neither is "4 categories", let alone their names and the number of questions in them. These are all arbitrary details that might be important to the application's entry point, but that a GameEngine object or a QuestionDeck object simply wouldn't (and, as far as I'm concerned, almost certainly shouldn't) care about. When we scatter these details within our constructors, it becomes difficult to differentiate essential invariants from configuration choices. (The only invariant I can think of here is that the QuestionDeck must be a circular queue of questions, so that we never run out of questions.) If we push the details up the call stack (which starts by pushing them up into the constructor towards the clients who instantiate them), then eventually they end up collecting at the entry point, and at a glance, I can see how we put together GameEngine, GameRuleBook, QuestionDeck, Scoreboard, and other objects to make this particular game that very strangely mixes Monopoly and Trivial Pursuit. I can then reuse some of these objects in other games. That's the point.

As for Factory, I see two competing issues: (1) "Abstract Factory" and "Factory Method" using "factory" to mean different things and (2) "Factory" in these cases not matching the general meaning of "factory". I'd rather use "Factory" consistently and inexactly than exactly in one case and incorrectly in the other. That's the personal preference part. And, OF COURSE, I'm a thought leader, so I have to try to convince the result of the world to think the way I do. (For the humor-impaired: when we have a strong preference, we often seek to influence people to share our preference--this is part of human nature.)

1 reply

April 2015 ▶ jbrains

michael_donaghy

> If we push the details up the call stack (which starts by pushing them up into the constructor towards the clients who instantiate them), then eventually they end up collecting at the entry point, and at a glance, I can see how we put together GameEngine, GameRuleBook, QuestionDeck, Scoreboard, and other objects to make this particular game that very strangely mixes Monopoly and Trivial Pursuit. I can then reuse some of these objects in other games. That's the point.

I think this is the biggest problem with Java culture: a tendency to overgeneralize, to build a framework when the business only wanted an application. Often one ends up with a "configuration" that is as complex as a first-class programming language (dysfunctional release policies also play a part in encouraging this antipattern, but that's a separate rant). "Opinionated" is the current buzzword; it was "convention over configuration" before that, but the point is the same: a library should decide as many things as possible for itself, only pushing choices up to the application when they really have to change.

Something like the design of a board game is complex enough to merit a general-purpose programming language (and if we don't think Java is the best of those then why are we using it?). The whole point of an object-oriented programming language is that we don't just have dumb structs, we have polymorphic values that contain both data and logic. Rather than having a `GameRuleBook` that might be configured with some rules or other, have a `MonopolyRuleBook` and a `TrivialPursuitRuleBook` that polymorphically implement the same interface. Then we can see at a glance which type of rulebook a given value is, and we can put more of the business-level invariants into the code. If we want to reuse some code for another game, we can extract an abstract parent class and write a new class for that game. Some people talk about this style as an "internal DSL" and go on about LISP, but I see it as just making the program reflect the business it implements.

In my more extremist moments I build applications with only one configuration parameter, which tells it which of several classes that implement the configuration interface (stag, prod, everyone's development machines...) to load. Then any other configuration is a polymorphic method call on that object.

1 reply

April 2015 ▶ michael_donaghy

jbrains

> I think this is the biggest problem with Java culture: a tendency to overgeneralize, to build a framework when the business only wanted an application.

I understand. I do the exact opposite. I understand the principles to nudge the design towards just enough framework/abstraction/generalisation. Unfortunately, in order to describe that to a wide audience, I have to choose examples simple enough to illustrate the concept that many people perceive (rightly so) as far too simple to merit such refactoring. I don't know how to solve this problem, except to ask my audience to bear that in mind.

Moreover, techniques like pushing details up the call stack serve not only to generalise incrementally (as needed), but also to allow the programmer to experiment with increasing the level of abstraction without committing to some framework that they later regret. Of course, if we don't push, then we don't learn, but sometimes when we push, we design ourselves into a corner. So use version control and be prepared to undo when you notice this happening. Every programmer, so far as I can tell, learns this way, and some of the best programmers I've seen have left some insane designs in their wake.

As for your distinction between GameRuleBook and MonopolyRuleBook and TrivialPursuitRuleBook--yes, of course. So far, we only have one Game and we only know that it's called "The Trivia Game". As we extract towards an increasingly generic framework, we will certainly have a layer of generically-named Game modules/objects/classes implemented by their Trivia counterparts, possibly linked together with some Abstract Factory that ensures that we don't try to combine Trivia implementation of some interfaces with Monopoly implementations of others. (Although... that could make for an interesting game, and if we respect the contracts of the various interfaces, then we could put these things together and just see what they do.)

So at this point I can't tell whether you're critiquing me or lamenting the current state of the average Java application. What I see so far amounts to you saying "But you have to do X too" or "But you have to be careful about Y", and I do X and am careful about Y, but I can't put all that into a single article. It'd be 100,000 words long.

December 2016

sehrgut42

I loved this article. However, your sidebar about not calling static methods which return instances of the class "factory methods" stuck in my craw.

By your own logic, these _are_ factory methods. You (as in the calling code) don't know what concrete class they're returning, only that it's an instance of the class specified as the return type. It could be a private subclass your code knows nothing about. An example from "in the wild" would be the .NET HttpValueCollection subclass of NameValueCollection, only returned by certain framework methods which declare a NameValueCollection return type.

The very fact that inheritance exists means that there's no meaningful distinction present in the term you're trying to replace and the terms you're trying to replace it with. Even if, at design time, the "named constructor" always returns an instance of its containing class, I could very easily identify a subset of use cases that would be better-served by a private subclass that, say, overrides a single method. Now in that "named constructor", I switch on membership in this subset and instantiate either the parent class or the subclass.

Nothing has changed about the public interface to the class, and yet by your classification it went from not being to being a factory.

"Named constructor" or "creation method" are not types of constructor: they're types of factory.