top | item 17406379

Anti-If: The missing patterns (2016)

223 points| fagnerbrack | 7 years ago |code.joejag.com

229 comments

order
[+] jakewins|7 years ago|reply
Another pattern I've found helpful is to have branches "exit early"; if you have code like:

    if !cache.contains(key) {
      v = db.query("..")
      expiry = calcexpiry(v)
      cache.put(k, v, expiry)
      return v;
    } else {
      return cache.get(k)
    }
by the time you get to the "else", your mental stack is filled up with the cache loading logic - you have to work to recall what this else branch is triggered by.

instead, make the branch be the least complex option and exit early, and skip the else. That way, it can be read as "if this condition, do this tiny logic and exit", and that branch can then be mentally pruned.

    if cache.contains(key) {
      return cache.get(key)
    }
    
    v = db.query("..")
    expiry = calcexpiry(v)
    cache.put(k, v, expiry)
    return v;
[+] agateau|7 years ago|reply
I have heard this called early-returns. I like to write code this way too. Proponents of one-return-per-function probably write code in languages like C which does not have GC or RAII so having only one exit point makes it easy to ensure all resources allocated by the function have been released.

There are ways to do early returns in C without duplicating resource releasing code though, using gotos (it's one of the few cases where gotos make sense, IMHO). Code looks like this:

  int aFunctionWhichMightFail() {
      int result = FAIL;
      Foo* foo = allocateSomething();
      if (!someFunction(foo)) {
          goto fail;
      }
      moreWork(foo);
      result = SUCCESS;
  fail:
      releaseFoo(foo);
      return result;
  }
The Linux kernel makes heavy use of this.
[+] hinkley|7 years ago|reply
Somewhere, in the long dark ago, a bunch of people got upset about long functions with return statements peppered throughout.

So we ended up with a Best Practice of one return clause per function. Which is triply stupid because 1) long functions are part of the problem, 2) break to the end with multiple assignments to the same variable is almost as bad. It’s just trading one stupid for another which is a waste of time and energy.

And then for many years a loud minority of us, some of whom actually have a vague understanding of how the human brain processes data, insisted on the pattern you describe here. Some of us call it “book ending”. All the returns are fast or proceed to the end. Any method that can’t accomplish either has too high a complexity anyway and gets split until you get either one return, or book ends.

[another responder calls them guard clauses, but this doesn’t quite apply in all situations. A clause that handles a default or a bad input is why you can rearrange, but it misse why you should: ergonomics]

[+] tudelo|7 years ago|reply
I believe this is commonly referred to as a guard clause? I tend to prefer writing things this way too... but I could see an argument for putting something in an else block instead of just leaving it there.
[+] harimau777|7 years ago|reply
Personally, I find the first version easier to read since it makes the structure explicit. I.e. there are two blocks and you will execute one or the other.

To me the second version makes it more difficult to differentiate the structure:

  if (...) {
    foo()
    return ...
  }

  return bar()
from:

  if (...) {
    foo()
  }

  return bar()
I find that when I'm reading code I'm generally more interested in the overall structure than in the specific implementation details. Especially since if the first block was complicated enough to tax my cognitive overhead, then I would probably extract it into a function anyways.
[+] dwc|7 years ago|reply
I like and use early exits when they fit well, which is often. And in this case I'd probably do as you suggest. But it's worth mentioning that even if you stick with if/else it's often worth it to put the simple case first to reduce mental load:

    if cache.contains(key) {
      return cache.get(key)
    } else {    
      v = db.query("..")
      expiry = calcexpiry(v)
      cache.put(k, v, expiry)
      return v;
    }
[+] pasta|7 years ago|reply
This. I would amost say: Don't use 'else' (if you can).

I also would like to add that it helps to fail early.

The article shows a null check for an add function. But the question is: why not throw an exception when the input is not what you expect.

This helps to find bugs very early.

[+] sigotirandolas|7 years ago|reply
Unless it’s a one-off thing, it’s also worth considering adding a cache class. This can have the advantages of further decoupling the cache and computation logic, and being able to switch the cache implementation (e.g. to use a no-op cache for debugging). It also hides the “if” by introducing a “cache pattern”:

    return dbcache.getOrCompute(key, function() {
        v = db.query("..")
        expiry = calcexpiry(v)
        return (v, expiry)
    });
[+] yoodenvranx|7 years ago|reply
Thank for posting this! I recently had quite a few discussions about this topic with people who insist to _always_ use if/elseif/elsif/... constructs in such situations. I tried to argue that in most situations a "if (a) {return}; if (b) {return}; do_c()" construction is much easier to understand but I had only little success convincing them.
[+] sbmassey|7 years ago|reply
Personally, it is better to have functions small enough that they fit on a handful of lines, and the if-else structure makes it obvious that you are doing one or another thing, rather than having separate if structures at the top of the function that may-or-may-not exit before you reach the main body of the function.
[+] adamtulinius|7 years ago|reply
Your code should be written this way instead:

    if !cache.contains(key) {
      v = db.query("..")
      expiry = calcexpiry(v)
      cache.put(k, v, expiry)
    }
    
    return cache.get(k)
Only one return, and everything is still in the logical order.
[+] rasjani|7 years ago|reply
I’ve used Python’s decorators for similar pattern. Syntax and scope is bit different but same structure in general level.
[+] slothtrop|7 years ago|reply
Good practice, something I tend to catch only after writing something inefficient.
[+] mozumder|7 years ago|reply
These things really should be optimized by the compiler. No reason for it not to.
[+] barrkel|7 years ago|reply
Don't use boolean parameters to toggle behaviour - they're indecipherable from the calling context. But don't then mindlessly create the cross-product of methods from every combination of true and false for your boolean parameters: If the caller would have used an expression rather than a literal to determine which version of the method to call, use enums instead of booleans or multiple methods. And don't turn setEnabled into enable/disable pairs, or setVisible to show/hide pairs - these are understood as properties, not parameterized methods.

Enums vs inheritance isn't straightforward. Enums make it easy to add new kinds, while inheritance makes it easy to add new methods. They are more extensible in different directions. If it's very rare for you to add new values to your enum, and more frequent for you to switch on that enum, prefer enums (and always throw in your default clause!). If you are adding enum values all the time, and only have a couple of methods that switch on them, then prefer inheritance. Although if you're using Java, consider methods on your enum values instead.

To understand the enum / inheritance tradeoff better, examine the Expression problem: https://en.wikipedia.org/wiki/Expression_problem

[+] mikhailfranco|7 years ago|reply
Erlang provides an 'if' expression as syntactic sugar for 'case' (switch), which must always have a true (else) clause. The 'if' form is strongly deprecated.

Even the 'case' expression is mildly deprecated in favor of pattern matching and guards in function heads (definitions).

There are no null values, no classes and no inheritance, but the presence of atoms (arbitrary tokens that can be used like null, true, false or enumerations), and pattern matching (data polymorphism and destructuring), means that conditional execution is elegantly implemented by functions.

This is the second, often forgotten, meaning of functional programming. Providing functions as first-class types, supporting anonymous lambdas (closures), and applying functions over collections are given all the glory, but deprecating conditionals in favor of pattern matching over many fine-grained functions is just as significant.

[+] macintux|7 years ago|reply
Now that I'm comfortable with Erlang, writing in any language that doesn't support branching at the function heads is so depressing.
[+] TooBrokeToBeg|7 years ago|reply
> There are no null values,

Unbound variables are the equivalent (detected if used, at compile time), but a much saner way to deal with value-less variables

[+] d--b|7 years ago|reply
I disagree with the comments here. This is mostly not good advice.

Ifs exist because most real world problems require them. “If the box is checked, do this, otherwise do that”.

Masking if statements by syntactic sugar doesn’t serve any purpose in my opinion and if anything it makes the code more opaque, or worse, may force you to duplicate code...

[+] meheleventyone|7 years ago|reply
I was wondering which of the examples you think particular fall into this category of making the code more opaque?

I actually went into this with similar misgivings. A lot of "get rid of ifs" advice, particularly from OOP people ends up making things more complex IMO. However most of the advice other than the "Switch to Polymorphism" seem very down to earth and a simplification. Less about getting rid of ifs as if their mere presence was a stain on humanity and more how to rewrite what they are doing more clearly.

[+] pmontra|7 years ago|reply
It depends on the language. Languages with pattern matching usually solve the problem with code like this (pseudocode)

    fn box(status=checked)
      do domething

    fn box(status=*)
      do something else

My Elixir projects have 0 to 10 ifs and they do solve real problems. I use plenty of ifs in other languages but I follow some of the advices of the post.
[+] jacobsenscott|7 years ago|reply
Of course there are legitimate cases to use if/else.

This blog post suffers from the classic problem of blog post code examples - to fit the example in a blog post it needs to be trivial, and because it is trivial it doesn't demonstrate the real value of these techniques.

I see if/else overused all the time - every single day in fact - when the techniques in the blog post should have been applied.

[+] collyw|7 years ago|reply
My previous team leader was a pretty poor programmer and would always describe the output of SQL queries he wanted me to generate with lots of "if, then".

It was then my job to translate that into SQL which doesn't really have "if" in the same sense as an imperative language does.

Sure I could have pulled out all the data and done the branching in Python, but that would have been a performance hit, but mainly because it would be ugly, require a lot more code and likely be a lot buggier. These days I try to keep as much work done in the database because the declarative nature of SQL generally works out with a lot less bugs as well as the performance benefit.

Sure it's not going to be possible for every case, but I have noticed that the more experienced I get, the less I like "if"s.

[+] quietbritishjim|7 years ago|reply
All rules for how to code better should be considered but potentially ignored while you're coding in practice. This includes the rule in the article ("never use if", although it doesn't really say that) and the rule in your comment ("never replace if with polymorphism", although this too is an exaggeration of what you said).

If your code contains quite a lot of branching - especially the same pattern of branching in several places - then you should certainly consider replacing that with polymorphism, in the form of virtual methods or perhaps templates/generics (compile-time polymorphism). But you should also consider leaving the branching in place if that is simpler overall. That doesn't make the article bad advice IMO, you just need to not take it too seriously.

[+] majormajor|7 years ago|reply
> Ifs exist because most real world problems require them. “If the box is checked, do this, otherwise do that”.

There's a notable difference between "if the box is checked, do [thing 1], otherwise do [completely different thing 2]" and "if the box is checked, do [thing 1], otherwise do [basically thing 1, but with this additional metadata from service y]."

And that difference, I think, is what tells you where to put the `if` statements. If things are distinct enough, you have some easy-to-read top-level ifs and switch based on that. Otherwise it's more complicated to trade readability vs code reuse/deduplication.

But where you get into big trouble is when you don't think about the dangers of if's at all, and end up with call stacks 5+ methods deep with no rhyme or reason to why certain things are done in if statements in level 5 and other things are done with if statements in level 1. Creating a mental map of what's done where in what conditions for that kind of code is really hard - as is thoroughly testing it, since this often means you're passing a lot of context really deep and don't have easily separable units. (And yeah, "Switch To Polymorphism," I think, is a risky one here - that can turn into "still have the deeply nested if statements, but make them invisible.")

I don't think the article really hits that well, though.

[+] sverhagen|7 years ago|reply
Yeah, as if a line with condensed conditional logic using Boolean operators really got rid of the branching and the running it through your head.

My attempted better advice would be to keep methods short and the nesting/indentation limited. Use good method names. Make it easy to understand each method in isolation.

[+] tigershark|7 years ago|reply
I completely agree with you for the case when you say that avoiding an if brings you to code duplication. In that case from my point of view it’s a big NO. But in all the other cases, if you are using an OO language, there are always better tools than ‘if statements’. If you are using functional programming with if expressions instead of if statements then the ‘best balance’ moves a bit in favour of the ifs or a lot towards exhaustive pattern matching. On the whole I would say that I agree with the article discussed here, but with some caveats.
[+] emsy|7 years ago|reply
If you really think it doesn’t serve any purpose this means you should always use if and never the proposed constructs. This is clearly wrong and I’d recommend you rethink your position.
[+] scandox|7 years ago|reply
As per another commenter I used Elixir for 6 months a couple of years ago and very quickly started using pattern matching and guards to provide most of the If-type control flow. It didn't cover every case but it came very naturally in most cases where it was useful.

The overall logic remains the same, it's really just giving one a clearer view of a specific path.

[+] csomar|7 years ago|reply
Like anything else in life it is about balance. If you abuse ifs, then your code is unreadable, hard to debug and hard to maintain. If you abuse the alternative patterns then your code, again, becomes hard to grasp and hard to maintain.

You need to look for a balance between ifs and alternative patterns.

[+] dahart|7 years ago|reply
> Problem: Any time you see this you actually have two methods bundled into one. That boolean represents an opportunity to name a concept in your code.

I like this statement. After 30 years of coding, I've found that trying to do something quick/clever without naming it explicitly, for fear of letting the code get bigger, is what often leads me to write confusing code and harder to understand ifs.

I think maybe it's that naming in new code is easy, but adding new concept names to existing code as it grows gets a lot harder. I'm automatically naming when I create a new class, but when I fix bugs or touch existing files, I'm actively trying to avoid naming things and I'm trying to touch the least possible code, so over time it trends toward having unnamed concepts.

I usually try to apply naming to Pattern 4. So maybe instead of:

  return foo && bar || baz;
I might:

  bool bad = foo && bar;
  bool worse = baz;
  bool isHorrible = bad || worse;
  return isHorrible;
[+] Zardoz84|7 years ago|reply
Just today, I was talking with a team mate about this. We had some pice of code that it's like this :

    public boolean foo() {
      

      if (!this.doesSomethingWithA()) {
        return false;
      }
      
      if (!this.doesSomethingWithB()) {
        return false;
      }
      
      if (!this.fooBar2000.isEmpty()) {
        return false;
      }
      
      if (!this.anotherLongAttribute) {
        return false;
      }
      
      if (!this.anotherMethod()) {
        return false;
      }
      
      return true;
   }
We try to replace to a "one liner" return !A && !B && !C ... Well, the expression was very long and we noted that was more hard to read that the multiple if's. I suggested split the one liner expression on multiples lines doing something like this :

    return !this.doesSomethingWithA()
        && !this.doesSomethingWithB()
        && !this.fooBar2000.isEmpty()
        && !this.anotherLongAttribute
        && !this.anotherMethod();
However, the code formatter (eclipse), changes every time it to the hard to read one liner expression. So we ended using the multiple if's.
[+] LoSboccacc|7 years ago|reply
I suggest reading "code complete". contains a whole section about this kind of implicit behavior into self documenting code, plus a lot more. a great deal of a lot more. really a mandatory read for all developers.
[+] colechristensen|7 years ago|reply
While I can see some value, the problem I seem to be running into the most lately is over-abstracted code. Untangling the web of naming and redirections leading to small pieces of code or single variables becomes extremely frustrating. Maybe it comes from education in an engineering background.

I would much rather see

    return foo && bar || baz;
and a comment explaining behavior (although perhaps parens so I wouldn't have to reason about order of operations)
[+] jstimpfle|7 years ago|reply
Overall, very good advice. I disagree on polymorphism through virtual methods being a solution to the maintainability problem switches pose, though.

With such polymorphism it's difficult to know the things that can happen. Implementations of the virtual method are not closed - they could come from anywhere in the codebase.

Another problem is that it's hard to cleanly separate the common code from the specialized code in each implementation class. That's why one ends up with ugly helper functions that receive a lot of state, or alternatively with unmaintainable class hierarchies.

IME the only solution is to minimize usage of datastructures that have choices in them ("ADTs") and minimize the coupling of the code to the choices. The latter is done by separating code paths early, and by only switching if absolutely necessary.

[+] JackMorgan|7 years ago|reply
I disagree that avoiding ADTs is the solution, it only exacerbates the issue further. When you only have flat records, they often have a lot of data that is or isn't set based on the state of the record. For example, a Workflow data structure that doesn't have an ErrorMessage field set when there's been no error. This could be a data structure with an FailedWorkflow and SuccessfulWorkflow, and the ErrorMessage only on the FailedWorkflow.

Helper functions that receive a lot of state aren't ugly, they are pure. Pure code is easily tested. It is easy to reason about and reuse, because it's clear what the expected inputs and outputs are.

http://deliberate-software.com/anemic-domain-model/

[+] noelwelsh|7 years ago|reply
It's important to take this kind of advice in it's context, which I would hand-wavingly describe as "imperative old-fashioned OO".

For example, the advice "Pattern 2: Switch to Polymorphism" definitely does not hold in languages with algebraic data types, where the compiler will check that your "switch" expressions check all possible cases. Given that at least Rust, Swift, Scala, and Typescript can all do this, this kind of advice is becoming increasingly outdated.

[+] ariehkovler|7 years ago|reply
Fascinating. I love constrained writing and constrained poetry.

No If-statements in code reminds me a bit of the E-Prime movement (https://en.wikipedia.org/wiki/E-Prime) which was about removing the verb 'to be' and all its variants like 'is', 'was' etc from English language. Supposedly it made English easier to understand and encouraged clarity and precision.

[+] wst_|7 years ago|reply
Although I generally agree with the concept the truth is it's not easy to stop using `if` or `switch` completely. By applying rules described in the post you are just pushing the problem around. At some point, somehow, a decision must be made which branch to follow. Want to create two methods instead of `if` fine, do it - how are you going to decide which of them to call? You want to use polymorphism, fine, do it - how are you going to decide which class to instantiate? What to use enums with methods on them, fine, how are you going to decide which enum to use? You may say "Oh, I am just deserializing user's input to enum. I don't have to decide." Fine, so you've just pushed decision onto users of your application. Sure, you may use some other methods, especially when you must choose from multiple options (map instead of switch, etc.) but you should always remember - just by avoiding decision making in one place, doesn't mean you've completely eliminated it from business process.
[+] MattBearman|7 years ago|reply
Interesting article and definitely some good advice, but I completely disagree with example 4.

Surely I can't be the only person who finds

    if (foo) {
        if (bar) {
            return true;
        }
    }
    
    if (baz) {
        return true;
    } else {
        return false;
    }
Much easier to parse than

    return foo && bar || baz;
It's a bit better to my eye with parentheses:

    return (foo && bar) || baz;
Of course the first version is far from perfect (no need for the nested if, etc.) but I've always found too many comparisons on one line just looks messy.
[+] d--b|7 years ago|reply
Pattern 1 is impracticable: most reasons why you would add a Boolean parameter to a function is to avoid code duplication:

void blah(int x, bool logInfo) {

   ...

   if (logInfo) ...

   ...
}

Pattern 2 can be horrible when applied to the wrong problem. As in: you can use polymorphism to implement fact(int x) without if statements, but it’s not going to be pretty...

Pattern 3 is not a pattern, it just says “if the if is useless remove it”...

Pattern 4 is particularly confusing, especially when you start to have nots in there:

var x = !(a || !b) // ugh

Pattern 5: doesn’t remove the if...

[+] r3n|7 years ago|reply
I would suggest one use these as a sign of "there is another way to write these code", rather than follow it without knowing why you need it.
[+] amarant|7 years ago|reply
am I reading this right? it seems number 3 is suggesting I accept Optional as function/method parameters? That is an anti-pattern if ever there was one, and it also adds another layer of if's instead of removing it. an empty collection is fine, but if I accept an optional, I need to first check that the optional isn't null (because it can be!) and if its not, check if the optional has a value(basically another nullcheck, jay!). optionals should only ever be used as a return-type. I hear a lot of people say that they should be used as parameters to indicate that one parameter is not necessary, but that's what overloads are for!
[+] KirinDave|7 years ago|reply
It's very weird to see people genuinely suggest, "Conditional logic is complicated, so try burying it in inheritance."
[+] jmartrican|7 years ago|reply
IF statements increase cost of unit tests, is what I have found. By enforcing tight code coverage standards you start seeing IF statements disappear from code as developers do not want to write a test case for each of those IF statements.

With tools like JaCoCo we track number of branches and cyclomatic-complexity in our code (which is a measure of IF statements in most cases, try/catch being the other big source of branches). We try to keep that number down to 1 as much as possible. Once was able to take a method with over 160 branches down to 1 by eliminating every IF statement. The unit test was then very trivial and only had one test.

[+] hellofunk|7 years ago|reply
> Moderation in all things, especially moderation

Brilliant!

> being defensive inside your codebase probably means the code that you are writing is offensive. Don’t write offensive code.

There are some catchy insights here.

[+] btilly|7 years ago|reply
Looking at the code in the examples, the first style thing that I would suggest fixing is that the name of a boolean variable should be a yes/no question which the value is an answer to. So don't name the boolean "temporary". Name it "isTemporary" instead.

Moving on to the actual point, for a balanced view just look at cyclotomic complexity. In a function, count the decision points. Every if and loop counts. Any function with 10+ decision points is too complex and needs to be refactored to make the control flow more obvious.

Following this rule will let you use if quite happily, while also avoiding the problems with it.

(A side tip. If your language has some variant of "unless", don't use it. Ever. Our brains do not do de Morgan's laws naturally. So you can write code that is clearer upon writing, but is much harder to keep straight when you're debugging. The time that you need to understand your code best is while debugging, so ease of debugging is more important.)

[+] dep_b|7 years ago|reply
It could be renamed "Things that suck about Java" and you wouldn't need to remove much. It's true that if-statements are something you might want to avoid in any language but Java doesn't give you as many ways to do it as I would like to see. The first part about the boolean is pretty universal for most languages.

For example the switch on type: wouldn't it be nice you could simply have a compiler error whenever you forgot to either add default or added all values? I use enums a ton in other languages that do and it's like a to-do list of code to implement if I would add for example another specie. Java just doesn't so the only compile-time safe way to have it is to force subclassing.

Equally the argument about not checking nulls inside your own application. Nullable pointers are a language problem. The argument to that function should not be nullable to begin with. It doesn't make sense to call it with null. Ever.

I find foo && bar || baz is pretty hard to parse as well. It might be easier to throw all three arguments in a switch / case statement and give all relevant states their own case. But that doesn't work in Java.

The hidden "null means something" example could easily be rewritten to a response that simply has a Error or Result state that either contain the error or the result we wanted to have. Java only allows you to return an object that you need to cast to use (Error extends BaseResult, Result extends BaseResult) or declare it might throw a particular type of error.

I think all if not most of these problems are solved in Kotlin. Java is just not a great language to express problems in a short and succinct way without introducing side effects or massive amounts of boilerplate (sub sub sub sub classing).

more experienced in Swift but Kotlin devs can follow the patterns I use

[+] sverhagen|7 years ago|reply
Let's eradicate nulls first, then do the ifs in your own time ;)