Linux kernel coding style

kagia|11 years ago

I doubt everyone agrees to that coding style, I certainly don't. However when submitting code to a project I'd still stick to the prescribed coding style, because I believe consistency in any code base can be just as important as any other measure of readability.

snlacks|11 years ago

I find it hard to agree with a lot of this, but it'd obviously be someone who's written a lot of code, thought a lot about how to write code, and reads a lot of code - even if we didn't know who it was. There's a lot to learn from reading stuff like this, if you take it all with a grain of salt... or if you're contributing.

notacoward|11 years ago

Mostly good advice, sometimes even great, but the part about typedefs is total BS. Any non-trivial program will use values that have clearly different meanings but end up being the same C integer type. One's an index, one's a length, one's a repeat count, one's an enumerated value ("enum" was added to the language to support this very usage), and so on. It's stupid that C compilers don't distinguish between any two types that are the same width and signedness; why compound that stupidity? Both humans and static analyzers could tell the difference if you used typedefs, and avoid quite a few bugs as a result. Being able to change one type easily might also make some future maintainer's life much better. There's practically no downside except for having to look up the type in some situations (to see what printf format specifier to use), but that's a trivial problem compared to those that can result from not using a typedef.

Don't want to use typedefs? I think that's a missed opportunity, but OK. Don't use them. OTOH, anyone who tries to pretend that the bad outweighs the good, or discourage others from using them, is an ass. Such advice for the kernel is even hypocritical, when that code uses size_t and off_t and many others quite liberally.

cesarb|11 years ago

Most of that section is concerned with hiding structs or pointers as typedefs: "In general, a pointer, or a struct that has elements that can reasonably be directly accessed should _never_ be a typedef."

Say you are reading a function, and see a local variable declared: "something_t variable_name;". Is it a struct, a pointer, or a basic type? Now compare with "struct something * variable_name;", which is clearly a pointer. If on the other hand it is "struct something variable_name;", you know that it's a struct allocated on the very small kernel stack (less than 8KiB per thread) - something which wouldn't be as clear if the fact that it's a struct were hidden by a typedef.

There are three main reasons to use typedefs: to allow for changes to the underlying type; to add new information to the underlying type (which is item (c) in that section); and to hide information. Since the Linux kernel runs in a constrained environment (as I mentioned, the kernel stack is severely limited, among other things), hiding information without a good reason is frowned upon. It's the same reason they use C instead of C++; the C++ language idioms hide more information.

> Both humans and static analyzers could tell the difference if you used typedefs, and avoid quite a few bugs as a result.

The Linux kernel does that! As I mentioned, it's item (c): "when you use sparse to literally create a _new_ type for type-checking." See for instance the declaration of gfp_t:

  typedef unsigned __bitwise__ gfp_t;

The __bitwise__ is for the Sparse static checker. There are other similar typedefs, like __le32 which holds a little-endian value; the Sparse checker will warn you if used incorrectly (without converting to "native" endian).

DSMan195276|11 years ago

You're not understanding their idea behind typedef. IMO, their lines for usage are extremely good when adhered too.

The note about integers is worth complaining a bit about, I agree there is merit to typedef'ing integers in some situations, but the Kernel standard agrees with that in those instances (And the example is bad, there are instances of 'flag' typedefs in the kernel). In general the note about integers is just to discourage spamming typedef's everywhere.

More importantly then integers though, their note about making opaque objects with a typedef is extremely good practice, as it makes it easy to distinguish when it is or isn't expected that you'll be accessing the members directly.

The point of those rules are to allow typedef to actually be useful and communicate some information. If you just allow typedef'ing everything in every situation, then whether or not something is typedef'd becomes useless information to the reader.

raverbashing|11 years ago

"Such advice for the kernel is even hypocritical, when that code uses size_t and off_t and many others quite liberally"

Did you even read their explanation. Apparently not.

This is an acceptable use of typedefs, as explained there, exactly because a size_t varies between architectures.

dllthomas|11 years ago

I've had luck using single-element structs to distinguish between types of data when I'm throwing a lot of primitive types around. In my test with gcc, the generated code was identical to using the primitives directly, although the standard doesn't actually guarantee that and it's historically not been the case in some particular compilers (not sure which).

bluecalm|11 years ago

As a recreational C programmer I have the same impression. I've always used typedefs for structs and enums in my code and I think it makes it more readable and easier to work on. My reaction to reading the kernel style guidelines was a surprise and I am happy I am not the only one disagreeing.

maguirre|11 years ago

I agree with you. I think structs help readability specially we using function pointers within structures. Would it be out of line to suggest a new naming convention for struct typedefs and pointer typedefs i.e _t for typedegs and _tp for typedeg pointers

robinhoodexe|11 years ago

"First off, I'd suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it's a great symbolic gesture."

Shots fired.

wsc981|11 years ago

I thought it was a funny read (and I certainly don't agree with everything, even if some snark remarks gave me a good laugh). I wonder if the style guide has been written by Linus, which would make sense to me.

bryal|11 years ago

With the target being the, what?, 5 people?, that enjoy following the GNU coding standards.

patrickg|11 years ago

I am glad that for Go there is `go fmt` which predefines some of the issues mentioned in the article. Thus there is "one global coding style for Go". It's another matter if one likes it or not.

Dewie|11 years ago

I don't see why there couldn't be a `kernel fmt` tool. In this day and age, we should really be beyond having to worry about things like hmm, what was the brace style in this project again, and should all if/while/for have mandatory braces?.

jackalope|11 years ago

"Get a decent editor and don't leave whitespace at the end of lines."

Trailing whitespace always raises a huge red flag for me whenever I look at someone's code. It's not just sloppy, it often makes diff output so noisy you can't detect real changes to the code.

Perseids|11 years ago

I understand that it would be bad to introduce whitespace-only changes, but why would whitespace at the end of the line that doesn't break the 80 character limit be a problem otherwise? Sure, git colors them red in its diffs, but that is kind of a circular reasoning.

E.g. in this diff

  0a1
  > k=2 
  2c3,4
  <     print(i)
  ---
  >     k*=k 
  >     print(k)

you don't even see the spaces after "k=2" and "k*=k".

teacup50|11 years ago

Who cares? Get a decent editor that doesn't give a crap if there's invisible whitespace at the end of a line.

JBiserkov|11 years ago

>Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged - the compiler knows the types anyway and can check those, and it only confuses the programmer. No wonder MicroSoft makes buggy programs.

"Making Wrong Code Look Wrong" by Joel Spolsky is a must-read and contains an explanation of Apps Hungarian (the original, thoughtful one) vs Systems Hungarian http://www.joelonsoftware.com/articles/Wrong.html

humanrebar|11 years ago

C is not a strongly typed language and it does not allow function overloading. C projects should allow for some flexibility in naming notations to make up for those language design decisions.

Also, any project that uses int return codes shouldn't be leaning too heavily on type safety.

cbd1984|11 years ago

Right. Use Hungarian to encode information the type system can't represent. This is relevant even in object oriented languages, such as distinguishing between escaped and unescaped strings.

kakakiki|11 years ago

"There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3."

HA!

snlacks|11 years ago

I laughed too. He makes a good point about the amount of indentation in code.

As someone who spends most of their time in JavaScript, I see how hard it would be to fit our code to this, and at the same time how much we'd all benefit if we tried to.

I just looked at the random JS file on top of my editor... have some refactoring to do.

kijin|11 years ago

> spaces are never used for indentation

If indentation should always use tabs (0x09) and never spaces (0x20), then the whole rant about indentation width is pointless. Any modern editor will allow you to adjust the displayed width of a tab. It's only when you use spaces for indentation that the width becomes a concern.

repsilat|11 years ago

There are two counterarguments to this:

- Line length. Some people say lines should be no longer than 78-80 characters, and you can't reasonably enforce a rule like this without answering how "wide" a tab is.

- Alignment. The "right thing to do" is to indent with tabs and align with spaces, but this is difficult for some people, against the religion of others (mixing tabs and spaces!) and insufficient if you want to align text that spans different levels of indentation. If most people use 8-character wide tabs, things will at least look right for them when it inevitably goes wrong.

rossy|11 years ago

When you have a limit of 80 characters per line, the indentation width still matters because code that doesn't appear to overflow with 4 space tabs could overflow with 8 space tabs.

raverbashing|11 years ago

I like it

I really prefer using tabs. Having it displayed as 8 spaces in other languages is not as good as in C

And they get it right about typedefs in C

Dewie|11 years ago

Right, having tabs seems better since I can configure my editor to display tabs as 4 spaces, while whoever else can have tabs be displayed as 8 spaces. Having spaces instead, and having to make your tabs output spaces, and perhaps also backspace deleting four spaces (a "tab") in certain contexts, seems pretty complex in comparison.

Zardoz84|11 years ago

2 spaces to indent is enough for anyone.

BugsBunnySan|11 years ago

It's a very nice coding style. It keeps the code in pieces that are easy to grasp as units, it doesn't waste space and doesn't clutter the code at the same time.

Just take any random function from the kernel sources and ask yourself, what does it do. I think in most cases you'll find it's really obvious...

For me I find the kernel sources one of the most readable and understandable sources I've seen. The structure of them is just so clearly visible from the sources. I think a lot of that has to do with the coding style.

davidw|11 years ago

I'm a fan of the Tcl/Apache/BSD style. Indeed, Tcl has nice C code:

https://github.com/tcltk/tcl/blob/master/generic/tclCompile....

oneeyedpigeon|11 years ago

Nice, but they definitely overcomment IMO, e.g.

https://github.com/tcltk/tcl/blob/master/generic/tclCompile....

thatswrong0|11 years ago

> Do not unnecessarily use braces where a single statement will do.

    if (condition)
	    action();

> and

    if (condition)
	    do_this();
    else
	    do_that();

The Apple SSL bug (https://nakedsecurity.sophos.com/2014/02/24/anatomy-of-a-got...) makes me wonder if this is really worth the potential for introducing bugs.

DSMan195276|11 years ago

IMO, the bug was a much deeper issue then simply not putting braces on if statements. It doesn't matter if the code becomes this:

    if (condition) {
        goto fail;
    }
        goto fail;

if nobody looks at the commit. Don't get me wrong though, braces on if's do help for making cleaner patches, so there is a valid reason to request braces. You should never rely on them to fix these types of bugs though, that's bound to come back and bite you. In general, having a proper system for submitting and approving patches (Like the Kernel has) will allow you to avoid errors like this one.

RogerL|11 years ago

Any decent code analyzer will pick that kind of thing up. I'd say the Apple bug speaks more about their process than their coding standards.

whoisthemachine|11 years ago

I agree with and already practice many of these conventions (at least the ones that apply to C-like languages in general). It's interesting that I do and I kind of wonder what lead me down that path, since I haven't programmed in C since my college days. I often think that my assembly class from those days pushed me into making my code as vertical as possible rather than the indented-if-statement-curly-brace hell that I often see, since assembly was very readable without having that capability.

geekam|11 years ago

>> Do not unnecessarily use braces where a single statement will do.

Shouldn't this be changed to always use braces? Given the Apple bug?

dllthomas|11 years ago

The "Apple bug" - I assume you mean the duplicated "goto fail" - isn't really a common kind of error. That said, there are others that "braces everywhere" does protect from somewhat. The question is whether the clutter trades off too much in readability. Linus apparently thinks it does.

Nmachine|11 years ago

I stopped reading pretty early on: ... if you need more than three levels of indentation you're screwed and should rewrite...

101 comments