top | item 11181742

GCC 6: -Wmisleading-indentation vs. “goto fail;”

199 points| dmalcolm | 10 years ago |developerblog.redhat.com | reply

165 comments

order
[+] strommen|10 years ago|reply
Making braces optional in single-statement if/else/while/for clauses is one of the biggest anti-features in C. It's frustrating that it was ported forward to more modern languages like Java, JavaScript, C#, etc.

I'm glad Python (with semantic whitespace) and Go (with gofmt) solve this problem.

[+] ajross|10 years ago|reply
> one of the biggest anti-features in C

Please. I'm as big a C apologist as you'll find and even I can think of six or nine thing objectively worse about the language.

This issue is merely a great bike shed because it allows all of us rabble to join in on a discussion about "compiler" technology.

Look: it's a good warning. People should clearly use it. It probably should have been written long ago, and probably would have if our editors hadn't been enforcing this since time immemorial.

[+] ktRolster|10 years ago|reply

  >I'm glad Python (with semantic whitespace)... solve this problem.
Python solved it....then added the problem of making things you can't even see semantically significant. If you write Python, it's helpful to have an editor that makes the difference between tabs and spaces visible....
[+] deweerdt|10 years ago|reply
I think that gofmt is particularly innovative, in the sense that it acknowledges that formatting is integral part of the language.

In the sense that a programming language is not only made to be parsed by a computer, but also read back by a human.

[+] pavlov|10 years ago|reply
Single-line ifs are pretty useful with traditional-style C libraries that expect all checks to be done at call site:

  if (ptr) call_oldschool_thingy(ptr);
[+] pklausler|10 years ago|reply
Haskell's solution to this problem is also excellent and needs to be better known and copied into newer languages. The language has curly braces around the bodies of 'let', 'case', 'where', and 'do' constructs, and also the bodies of modules, and semicolons between the constituents of those constructs -- but when the compiler demands a '{' and doesn't see one, it uses indentation to insert them in the obvious places. If your indentation is incorrect, you're likely to get an implicit ';' or '}' put in the wrong place, and you go fix it.

The scheme works so well that people will often go months into their Haskell education before they encounter code with explicit braces and semicolons.

[+] GTP|10 years ago|reply
I don't like Python forcing you to use a particular identation. And in my opinion optional braces are not an anti-feature, "with great powers come great responsibilities": like any other tool it's up to who is using it to use it in the right way.
[+] ilek|10 years ago|reply
I don't find something like the below frustrating really

if (a)

    ...
else

    ...
But for some reason I find the below _very_ frustrating. It feels misleading and I find it to be very, very ugly.

if (a)

    ...
else {

    /* Multi line block */

}
[+] moron4hire|10 years ago|reply
I agree. I've trained myself to religiously apply curly braces to all blocks, but it'd be more reassuring to know there was a compiler check for it. I do have linters set up to check for it, but I just wish it were default.

This is also one of the reasons I like Racket: there's no such thing as ambiguous block delimitation. Also, everything-is-an-expression is very useful.

[+] wdr1|10 years ago|reply
> Making braces optional in single-statement if/else/while/for clauses is one of the biggest anti-features in C.

What's frustrating is that K&R still has a lot of example code like that as well. It would be wonderful to see a 3rd edition even if it do nothing but change its code to use braces.

[+] andybak|10 years ago|reply
The argument for significant white space in Python goes as follows:

You need indentation for humans to understand the structure. Why do you also need braces for the parser to understand the structure when the parser can use the same information that your eyes use? You therefore avoid the possibility of the two signals contradicting each other.

[+] RHSeeger|10 years ago|reply
Because, by having both, you can have your editor auto-indent and then you get a visual indication of where implementation does not equal intention.
[+] arohner|10 years ago|reply
Allowing explicit braces fixes deficiencies in python's grammar. For example, multi-line lambdas aren't currently possible in python.
[+] return0|10 years ago|reply
Agreed, but with braces, i can press % over a '{' in vi and find the matching }
[+] gravypod|10 years ago|reply
This seems like it could have been avoided by people using braces around every block.

Omitting braces in this case leads to a lot of problems.

[+] iainmerrick|10 years ago|reply
This warning is a practical solution to a real problem (of which the famous "goto fail" is an example).

Using braces everywhere is definitely good style, but it isn't a practical solution because there's a ton of existing C/C++ code that doesn't use braces.

[+] AdmiralAsshat|10 years ago|reply
I'm in favor of braces around everything, but to be honest even I will occasionally use the indent if I only expect one action. e.g.

  if (true)
    foo();
  else
    bar();
This looks prettier to my eyes than

  if(true){
    foo();
  } else {
    bar();
  }
However, I might not be the last person to touch the code. My coworker might come later and add:

  if (true)
    foo();   
  else
    bar();
    baz();
And hence the indent problem.
[+] MichaelBurge|10 years ago|reply
Suppose Debian decides to hire you as their new release maintainer. They want to ensure that bugs such as the OpenSSL one mentioned in the article never happen again. What actions would you take before next month's release to attempt to catch it?

I see from your comment that you might hire a team of developers to go through a critical path of important C/C++ packages checking for style violations. Any code found missing braces will have a patch submitted upstream to correct the errant style. Any package that declines the style changes would be removed from Debian. Any developer you hire that misses style violations will be removed from the team.

Another approach might be: Turn on the warning mentioned in the article and manually review any cases that come up. That still might generate a lot of cases, but it at least seems possible.

Is there another actionable interpretation of your comment that I'm missing?

[+] epx|10 years ago|reply
This is the only item that I disagree with the Linux coding style ("Do not unnecessarily use braces where a single statement will do.")
[+] pkaye|10 years ago|reply
I remember doing this mistake back in college. Was required to program and demo the B-Tree backend for a mini database. Things were very well tested for tons of random sets of data but while waiting for the my turn to demo, added one line of debug print on an "if" statement which caused the now dangling statement to be always executed. This caused a subtle bug that the professor caught in his testing. Sucked because I had everything correct up to that point. Learned my lesson about last might changes and coding style at that point.
[+] kllrnohj|10 years ago|reply
People will make mistakes. Policy does not prevent that. Tools do.
[+] ConceptJunkie|10 years ago|reply
I've done this for as many years as I can remember. Unfortunately, almost no one else I work with does.
[+] lmm|10 years ago|reply
Extra braces add visual clutter and reduce readability, especially when they mean a function no longer fits on one screen. So if there's a way to eliminate that kind of bug without having to add more braces then I'm all in favour of it.

(Personally my preference would be a linter that automatically runs on checkin, and refuses commits that do not conform to the style guide)

[+] oftenwrong|10 years ago|reply
Semi-related: What do people think about if-else-if... chains vs nested simple if-else blocks? I have seen many cases on the job where someone writes a complex if-else-if chain and then an oversight in their logic WRT the dependencies between conditions causes the wrong branch to be taken. I prefer the latter style of the ones I've put below, especially when the conditions are more complex. For me, it makes it easier to mentally picture the control flow.

    if(condition_a && condition_b){
      do_thing_a();
    } else if(condition_b){
      do_thing_b();
    } else {
      do_something_else();
    }
vs

    if(condition_a){
      if(condition_b){
        do_thing_a();
      } else {
        do_thing_b();
      }
    } else {
      do_something_else();
    }
Furthermore, I prefer functional languages where if-then-else is an expression with a mandatory else (or doing control flow via pattern matching with enforced exhaustiveness like you can get with GHC). I don't like surprises.
[+] wting|10 years ago|reply
Kinda late but I want to point out these snippets have different behaviors:

    if A and B:
        ..
    elif B:
        ..
vs

    if A and B:
       ..
    elif A and not B:
       ..
I prefer flattening because enumeration makes it obvious what's different between the branches despite the code duplication.

For example, three boolean conditions has 2^3 = 8 possible combinations. When nested this complexity is hidden but obviously apparent as a code smell when flattened.

It also echoes Python's ethos that flat is better than nested.

[+] jahewson|10 years ago|reply
I don't think there's a general statement to be made about this. I have no problem reading both examples and actually prefer the first. But it depends on the what the specific logic is - some conditions are easy to understand, despite their size, others are small but subtly complex. I certainly wouldn't want to mandate writing code like your second example without considering the use case first - YMMV.
[+] dcvuob|10 years ago|reply
Second example is very readable using Allman style:

    if( condition_a )
    {
         if( condition_b )
         {
             do_thing_a();
         } 
         else 
         {
             do_thing_b();
         }
    } 
    else 
    {
         do_something_else();
    }
Editor space is free, why not use it.
[+] Coding_Cat|10 years ago|reply
Thinking about it, I'd say the second one is better. I think it parse better in my mind as to what is meant.

However I also always comment nested logic chains with their intent in plain language. I have made bugs in going from intent->complex logic to often, and I found that writing comments at each fork greatly reduces these errors as well as the odds of missing an edge case.

e.g., I'd write "if(a) and not b" in plain English at the nested else statement, possibly followed by a "what & why" comment at the do_thing_b statement.

[+] iainmerrick|10 years ago|reply
What a great idea. It's hard to believe no-one ever did this before! (Or did they?)
[+] Piskvorrr|10 years ago|reply
Well, some IDEs will warn you even during editing - a step ahead of compilation (e.g. IDEA and its kin). The earlier this antipattern is caught, the easier it is to fix.
[+] Ace17|10 years ago|reply
Why not give to your developers an immediate way to reformat the whole source tree before each commit/pullrequest, using dedicated tools, like uncrustify, bcpp or AStyle? Just store the configuration file in the repository, hook the reformat pass to the beginning of your build, and you're done.

We've been doing this at work for several years ; and we found that this solved nearly every style-related issue (diffs, arguments over which code is 'prettier', artificial merge conflicts). It turns out the style becomes a lot less important issue once you can rely on a tool to apply it for you. And it also solves the misleading indentation issue (probably by preventing it to happen in the first place by causing a merge conflict).

[+] humanrebar|10 years ago|reply
In theory that works fine, but in practice, I find every tool leaves little bits of cruft laying around:

// comments that get formatted correctly but then // the // wrapping // isn't // merged // into // one // line

On the other hand

// this could // be a // tabular comment

So you'd need everyone to use the same tool and you'd still need to go back and fix things periodically. To be fair, it might still be a net win.

[+] 21|10 years ago|reply
I wonder if it would be feasible to add a -allow-python-style option, which would allow the use of some Python syntax in C++, like significant whitespace or `if a:` instead of `if (a)`

Cython is already something sort of like this.

[+] pif|10 years ago|reply
> it’s been finding real world bugs

One more case to support -Werror.

[+] GFK_of_xmaspast|10 years ago|reply
-Werror holds things back, because it makes the gcc maintainers hesitant to add more warnings on grounds of "it will break old code that uses -Werror", and in fact I was surprised to see that this warning is going into -Wall.
[+] return0|10 years ago|reply
This is so useful, i wonder why it wasnt there before. Especially when copy-pasting code with different indentation levels / tabs etc.
[+] noamsml|10 years ago|reply
Nice. These sorts of warnings are why it's worth investing the extra effort to enable -Werror in your codebase if you can.
[+] tbirdz|10 years ago|reply
The problem with -Werror in open source projects is that it can produce errors for users building your code on different compilers or compiler versions than the one you originally developed it for.

As compilers change they can end up adding new warnings to the defaults, or for or example -Wall, or -Weverything. And different compilers might throw different warnings on the same code with the same options involved. Using -Werror will force an error, which can end up terminating the build for users, even when the code compiles cleanly on your version of your compiler.

So -Werror is fine if you are shipping binaries, but if you are shipping source code, it's probably a good idea to not use it in the public build system.

[+] marcosdumay|10 years ago|reply
No, I've never seen a real use case for -Werror.

What's the difference if compilation stops at the warning? You'll (or your team, or whoever) fix it anyway, and if you won't, you have a people problem, that must be solved at the policy (or HR) level.

Solving people problems at the tooling level is a certain way to get unintended consequences and alienate the good people that weren't part of the problem. Of course, you can get some tooling to support your people, but tools to police them are worth less than zero.

Anyway, unrelated to that, I do like to live warnings on code that is not ready to production. It's an easy (effortless in fact) way to make sure it'll be fixed before deploying.

[+] swehner|10 years ago|reply
You'd think a stand-alone program could take care of this quite nicely? No need to clutter the compiler itself.
[+] TorKlingberg|10 years ago|reply
I have set up stand-alone static analyzers a couple of times, and it can be a giant hassle to get them working correctly. It is especially bad with complicated build systems for embedded applications.

To parse a C file correctly a tools needs to know the exact set of include path and defines you passed to the compiler. Then it needs to go read the whole tree of header files and preprocess everything. It needs to know where your standard header files are (different from the system header file when cross compiling). It needs to know about your compiler's built-in defines. It may also choke on any C extensions that your code (or any header file) is using.

In this particular case it may be enough to parse the file with some regexes but I wouldn't trust it; people do some crazy things with C macros.

[+] GFK_of_xmaspast|10 years ago|reply
It's hard enough to get people to turn on warnings in the first place.
[+] splicer|10 years ago|reply
Does GCC or clang have a warning for the following?

  if (foo);
  {
      bar();
  }
[+] sesutton|10 years ago|reply
GCC warns with -Wextra (or -Wempty-body) and clang warns by default.
[+] anon4|10 years ago|reply
Interesting. I would have assumed that this kind of check should be done by static analysers.
[+] jemfinch|10 years ago|reply
Compiler warnings are static analyzers.