Early exit is a tail call optimization of procedural languages

[+] jpollock|10 years ago|reply

I've been in the "only a single return" and "early return plz" camps at various points in my career.

Right now? I'm early return.

When I wrote C and C++ code with locks and malloc/free? One exit only.

It's all about avoiding bugs.

In C code heavy with locks and memory allocation, you need to make sure your post-conditions are correct. That is near impossible when there is more than one return [1].

In C++, with RAII and Java with "try-with-resources", I can ensure proper cleanup. At that point, it's about making it easy for someone who is reading the code to avoid screwing it up. I need to make it easy for them to flow through the code, keeping the cognitive load as low as possible. Functions do what they say on the box and _only_ what they say on the box. At that point, my goal is to make a decision, act on it and move on. I don't want to carry that state forward - it means I have to keep it in mind for every line after it. I'm even trending towards having the majority of the methods in my classes be static functions with no state.

Finally, if I've got a function that has two different flows and I'm deciding between them with an if? I've probably actually got two different functions pretending to be one. I then try to refactor and get rid of the if.

[1] http://programmers.stackexchange.com/questions/154974/is-thi...

[+] legulere|10 years ago|reply

But in C you usually use `goto Error;` which basically is just a `return` + you can do the cleanup yourself.

[+] Stratoscope|10 years ago|reply

The error checking example conflates two different things:

• Early returns vs. single return.

• Excessive nesting vs. simpler flatter if/else.

The real problem with the "before" example is the excessive nesting. I do prefer early returns myself, but even if you wanted to use the single-return style, you could simply un-nest the whole thing:

    public string Execute( int integer, string str, bool boolean ) {
        string result;
     
        if( integer <= 42 ) {
            result = "Error: Integer is too small";
        } else if( string.IsNullOrWhiteSpace(str) ) {
            result = "Error: String is null or empty";
        } else if( ! boolean ) {
            result = "Error: Incorrect boolean";
        } else {
            result = "Success";
        }
     
        return result;
    }

It's a bit ironic that this is C# code, because ReSharper would offer to un-nest the code for you.

Also, this really has nothing to do with tail call optimization.

[+] barrkel|10 years ago|reply

Frequently the checking is dependent on passing the previous clause. Something like:

    if (x == null)
        return "x is required";
    if (x.length < 3)
        return "too few elements in x";
    if (x[0] == null)
        return "first element missing";

The point here being that each successive check is testing something that is only valid if the previous check passed. This can't be so easily converted to your if/else pattern without re-encoding the early exit in the form of a complex boolean (&& and || have early exit built in).

The concept's applicability to tail call optimization is only by analogy with the mental stack; just as tail call optimization sets up the call such that the current stack frame no longer exists, early exit sets you up to remainder the rest of the code without regard to any potential alternate code paths (i.e. a potential else branch on an early if statement) later; the mental frame of a long if statement no longer exists. The analogy is a little bit stretched; of course it is not a technical isomorphism, but nobody ever said it was.

[+] tikhonj|10 years ago|reply

Return statements and "early exit" are not built into functional languages. If the language, like Scheme, supports continuations, we can use callCC to implement an early return[1]:

    (define (example x)
      (call/cc (lambda (return)
        (when (< x 0) (return #f))
        ; more code, including possible more calls to return
        0)))

However, most functional languages do not have callCC built in. Happily we can get a similar effect by writing our code in continuation-passing style (CPS)[2]. (Think Node.js.)

Nominally, CPS has some overhead because of the extra function calls involved. However, a CPS transform also puts every call in tail position, making them eligible for tail call elimination. This means that we can use CPS with no function call overhead in a language with proper tail calls.

What does this mean in practice? It means that calling a callback in CPS style—which, being in tail position, can be eliminated—is effectively just a jump under the hood. Not too different from return!

Putting it all together, we get that tail call elimination lets us write code in continuation passing style with no extra over head which gives us access to continuations that we can use to implement an early return in languages without return.

This turns out to not be what the article was on about, but I think it's more interesting.

[1]: Code from a StackOverflow answer by Nathan Shively-Sanders which goes into more detail: http://stackoverflow.com/questions/2434294/scheme-early-shor...

[2]: I wrote a brief explanation of CPS on Quora: https://www.quora.com/What-is-continuation-passing-style-in-...

[+] ridiculous_fish|10 years ago|reply

Is it possible to use CPS with functional primitives like map? Or do you have to implement CPS-savvy replacements? It doesn't seem like a CPS transform would let you early-exit from map in tail position.

[+] fantasticsid|10 years ago|reply

I would rather see more articles like this on hacker news!

[+] dang|10 years ago|reply

Anyone who wants to share their preference with us is welcome to email [email protected]. We're interested.

[+] DanBC|10 years ago|reply

How often do you visit /newest to upvote the content you want to see?

[+] acchow|10 years ago|reply

Was like this a few years ago.

[+] jkot|10 years ago|reply

Original code seems like really bad way to check validity of parameters. Is this type of programming common in some languages? I have never seen similar code in Java.

[+] jpollock|10 years ago|reply

Some groups have a "Functions can only have a single entry and single exit point" rule, meaning you have to jump through a lot of hoops.

Sometimes it is required. In Java, before try-with-resources, if you had any resource management (locks, file descriptors, sockets), a single exit point made resource management simpler. C code tended to have similar rules (or GOTO's to jump to the exit block in the function). C++ just used RAII, and made it the destructor's problem.

[+] bjacks|10 years ago|reply

I think that if the author re-wrote the first method in a cleaner way, composed of lots of smaller methods with descriptive names then the code would be a lot more readable, and you wouldn't even need to debate the whole multiple returns versus single point of return thing because your "mental stack" could handle either option equally well.

For example:

  public string Execute(int integer, string str, bool boolean) {
    string result;

    if (isValidInteger) {
       result = validateStringAndBool(str, boolean);
    } else {
       result = “Error: Integer is too small”;
    }
   return result;

}

  private string validateStringAndBool(string str, bool boolean) {
     if (!string.IsNullOrWhiteSpace(str)) {
       return isValidBoolean(boolean)          
     } else {
        return  “Error: String is null or empty”;
     }
  }
 
  private string isValidBoolean(bool boolean) {
     if (boolean) {
        return “Success”;
     } else {
        return “Error: Incorrect Boolean”;
     }
  }

And yeah, I can't see how this has anything to do with tail call optimization other than a fluffy analogy to stacks - mental and programmatic.

[+] hellofunk|10 years ago|reply

Would be nice to see more serious study of this for a language like C++. It would take either TCO or Garbage Collection to make reliable, robust persistent data structures a possibility in C++, and more would prefer TCO of these two. That would open a big door into more powerful functional programming idioms for C++ than currently exist, even with the latest language standards.

[+] adrianm|10 years ago|reply

How would TCO help? I've never read anything about the connection between TCO and persistent data structures. GC on the other hand is a known, and often assumed, requirement in the literature. Sounds interesting!

[+] pjc50|10 years ago|reply

Compilers already implement some TCO. What specifically do you mean about "reliable, robust persistent data structures" that you think is currently not possible in C++?

[+] glastra|10 years ago|reply

To me, this only makes sense if you always read source code sequentially. Humans are not machines, and most often you will not be reading code (especially code with branches) in a top-bottom manner.

Having multiple return statements in a method actually renders the method LESS readable. I understand the first proposal is also usually implemented with many return statements, but they could be replaced with a variable assignment, variable that is then returned at the end of the method (as in the example). If that variable is final (or whatever the C# equivalent is), that's even better.

[+] Bognar|10 years ago|reply

> Having multiple return statements in a method actually renders the method LESS readable.

I 100% disagree. Introducing a variable instead of using multiple returns means I now have to keep track of yet another variable throughout the entire rest of the method. Is it modified before it's returned? Gotta read the whole thing to find out.

Early returns reduce the complexity of a method by reducing the number of possible states as you move through the method.

[+] adamc|10 years ago|reply

I don't think I fully understand your comment. Regardless of whether I've read the whole program top-down, it is definitely easier for me to understand early-exit logic in a routine than to keep the mental stack of conditionals going, particularly when (as is often the case) the early exits consist of various deviations from the "main sequence" code (e.g., validations failing or other edge-cases).

[+] cballard|10 years ago|reply

Swift solves this with "guard" statements, which must return if their condition fails. This is most commonly used in combination with converting optional values to non-optional values, to prevent pyramids of doom.

    guard let something = maybeSomething else { return ... }

[+] SideburnsOfDoom|10 years ago|reply

> Having multiple return statements in a method actually renders the method LESS readable

Does it? I don't find it that. However, if you have never read code of this style before, it will be unfamiliar to you and you will find it less readable. But you can overcome this with practice.

I would rather say that it depends - both are valid styles, sometimes it's more readable to use multiple return statements and sometime less. As a programmer you know both and pick the option that is better suited to the problem at hand.

[+] btilly|10 years ago|reply

I have seen this claimed many times, and I've never seen any research for it that in the end came down to more than personal prejudice.

On the other hand http://www-cs.stanford.edu/people/eroberts/papers/SIGCSE-199... offers concrete reasons why early returns make sense.

49 comments