
Dangerous Embedded C Coding Standard Rules (2011)

34 points | momo-reina | 12 years ago | embeddedgurus.com

48 comments

[+] kabdib|12 years ago|reply
Well, let's see.

#1: Avoiding shifting versus doing division: Nope. Crappy compilers and bad architectures will yield bad results. Know what you're doing.

#2: Use a typedef instead of a bare C type, then. Also, on many systems space is king (the last embedded system I wrote had 6 bytes left over in code space, and 16-bit operations were incredibly expensive and nearly unaffordable).

#3: Just make the code clear. This rule is lunacy.

#4: Well, maybe. Except that variable initialization can often be made quite compact by the runtime (e.g., an expansion of a compressed data segment into RAM), and so save you code space.

#5: Again, the compiler can screw you here. Also, it's possible to take the address of a constant and have it be wiped out (for naïve implementations, anyway).

My basic rules:

#0: Don't make assumptions. Know what the heck is going on, fractally. Look at the generated code.

#1: Be a responsible engineer. Design and code for production and testing, and for the next guy. This often means not being clever, but if you are being clever, be forthright about it and don't leave your brilliance as a puzzle. (Don't be clever just to be clever. That will get you fired, no foolin').

#2: Avoid platitudes. Be wary of people who tell you not to use something, always, or to use a particular construct, always. Be suspicious of code that follows someone else's dogma. [I once worked with a code base where the engineer had used the ternary -- "?:" -- operator everywhere because he thought it was more efficient than using 'if'. Of course it was just guesswork on his part, and not only was his code a train wreck, but he had never measured his results. We fired that guy.]

Embedded systems are just software, they're not magical, they're not (usually) that hard, and for some reason people are afraid of them. Some of my best moments in the industry have been writing these little things, they're fun as hell.

[+] gioele|12 years ago|reply
> #1: Avoiding shifting versus doing division: Nope. Crappy compilers and bad architectures will yield bad results. Know what you're doing.

Which compiler, among those still used in 2014, does not convert a division by 2 into a shift?

[+] deletes|12 years ago|reply
Why are you agreeing with OP on rule #3 and then saying it is lunacy? Did I miss something?

> Better Rule: Use whatever comparison operator is easiest to read in a given situation.

One of the very best things any embedded programmer can do is to make their code as readable as possible to as broad an audience as possible.

[+] sophacles|12 years ago|reply
This is one of those blog posts where the comments/discussion are really good - maybe even better than the article itself. I recommend reading the discussion on-page if you haven't.
[+] wtracy|12 years ago|reply
Another reason to pay attention to #2: Most CPUs perform simple arithmetic operations fastest on their native word size. On a 32-bit machine, addition and subtraction on 8-bit variables is probably slower than on 32-bit variables. (Often the hardware will transparently cast to 32 bits, perform the operation, then cast back to 8 bits.)

I'm not an authority on this topic, so I don't want you to take away the message "always use the machine word size". What I do want you to take away is that there are many cases where an "obvious" optimization is counterproductive in practice. Always start with simple, working code, and never optimize without benchmarking.

[+] mcguire|12 years ago|reply
With regards to Bad Rule #3, "Avoid >= and use <," if I saw

    if (speed < 100) ... else if (speed > 99)
in code, my only response would be to question the author's sanity.

But then, I question the sanity of C standards committee members all the time, specifically regarding his actual complaint related to Bad Rule #2 and unspecified-width integer types[1].

[1] http://www.netrino.com/Embedded-Systems/How-To/C-Fixed-Width...

[+] pascal_cuoq|12 years ago|reply
The phrase “C standards committee” is eminently ambiguous, as the standards that define the C language (e.g. ISO/IEC 9899) are the work of a committee. These committee members have constraints coming at them from all sides and do a very good job of maintaining compatibility with the past and future-proofing the C language at the same time.

This has nothing to do with coding standards, which any fool can define (and some have), and for which the only instance that really deserves the name of “standard” is MISRA-C. The other such “standards” have a scope typically limited to a company and its providers, and reflect the style and idiosyncrasies of the person who was in charge of defining it (they are rarely the work of a committee).

[+] optymizer1|12 years ago|reply
I'm not completely on board with #4 (initialization of variables). I agree that 'declaring and then assigning a value' _within the same block_ vs 'initialized declaration' has no speed gains, only downsides.

However, if the assignment can happen in a _different block_ (maybe inside an 'if' block) you could save 1 memory write, depending on how many times the if condition is satisfied.

This obviously optimizes for speed at the expense of maintainability, and it's for the programmer to make intelligent trade-offs, but the fact is that one method is faster than the other.

Silly example that decrements 'a' repeatedly if condition is non-zero:

   //temp = 0 inside if block
   int dec(int a, int condition) {
       int temp;

       if (condition) {
           temp = 0;  //<-- memory write depends on condition

           //compute something here in a loop
           while (temp < a) {
               a -= (temp++);
           }
       }

       return a;
   }

   //temp = 0 at declaration
   int dec2(int a, int condition) {
       int temp = 0; // <-- memory write always executed

       if (condition) {
           //compute something here in a loop
           while (temp < a) {
               a -= (temp++);
           }
       }

       return a;
   }

We can disassemble the output to verify that dec() will only write to memory if the condition was satisfied (see 400458), while dec2() will always write into temp (see 400482):

    # <dec>
    400448:  push   %rbp
    400449:  mov    %rsp,%rbp
    40044c:  mov    %edi,0xffec(%rbp)
    40044f:  mov    %esi,0xffe8(%rbp)
    400452:  cmpl   $0x0,0xffe8(%rbp)  # if (condition)
    400456:  je     400473 <dec+0x2b>
    400458:  movl   $0x0,0xfffc(%rbp)  # temp = 0 <-- depends on condition
    40045f:  jmp    40046b <dec+0x23>  # while
    400461:  mov    0xfffc(%rbp),%eax
    400464:  sub    %eax,0xffec(%rbp)
    400467:  addl   $0x1,0xfffc(%rbp)
    40046b:  mov    0xfffc(%rbp),%eax
    40046e:  cmp    0xffec(%rbp),%eax
    400471:  jl     400461 <dec+0x19>  # loop
    400473:  mov    0xffec(%rbp),%eax  # return a
    400476:  leaveq
    400477:  retq

    # <dec2>
    400478:  push   %rbp
    400479:  mov    %rsp,%rbp
    40047c:  mov    %edi,0xffec(%rbp)   
    40047f:  mov    %esi,0xffe8(%rbp)
    400482:  movl   $0x0,0xfffc(%rbp)   # temp = 0 <-- always executed
    400489:  cmpl   $0x0,0xffe8(%rbp)   # if (condition)
    40048d:  je     4004a3 <dec2+0x2b>
    40048f:  jmp    40049b <dec2+0x23>  # while
    400491:  mov    0xfffc(%rbp),%eax
    400494:  sub    %eax,0xffec(%rbp)
    400497:  addl   $0x1,0xfffc(%rbp)
    40049b:  mov    0xfffc(%rbp),%eax
    40049e:  cmp    0xffec(%rbp),%eax
    4004a1:  jl     400491 <dec2+0x19>  # loop
    4004a3:  mov    0xffec(%rbp),%eax   # return a
    4004a6:  leaveq 
    4004a7:  retq
The above code was compiled with gcc 4.1.2 on amd64/linux. gcc -O2 and gcc -O3 completely do away with the 'temp' variable and generate fewer instructions.
[+] ant512|12 years ago|reply
I'd suggest that, if a variable with an undefined value is a valid state for the program, the declaration is in the wrong place. How about this alternative?

   int dec(int a, int condition) {
       if (condition) {
           int temp = 0;

           //compute something here in a loop
           while (temp < a) {
               a -= (temp++);
           }
       }

       return a;
   }
[+] nitrogen|12 years ago|reply
I'm not particularly familiar with x86-64 calling conventions. Where is the code that sets up the stack frame? Would this code look different if temp were created inside the if() block, not just initialized there?
[+] deletes|12 years ago|reply
Completely agree with no. 5. After a year with C I realized that macros are best reserved for include guards and the #ifdefs that enable compatibility.

If you need macros to hack C to enable some functionality not inherent to the language, you should change the approach or switch to a different language.

Also macros are usually used to "speed up" the code. You should never optimize before finding the real bottleneck in your code. And thus always use functions instead of macros.

[+] Nursie|12 years ago|reply
Macros enable you to replace things at compile time.

The C preprocessor is a powerful and awesome thing by itself and I miss it when it's not available in other languages. It enables you to do things like pass line and file numbers automatically. For instance -

    #define log_message(format...) _log_message(__FILE__, __LINE__, format)
    void _log_message(char* filename, int line, char* format, ...);
And there's nothing inherently wrong with #define for frequently used magic numbers. They have the advantage of being pre-processed out rather than being separate variables at compile time.

Macros are also useful to simplify little repeated chunks of code that don't really need their own function.

I've not often seen macros used to try to speed up code.

--edit-- I don't know how long you've kept up with your C, but what you've realised after a year with the language doesn't seem as obvious to me after 15.

[+] theseoafs|12 years ago|reply
> If you need macros to hack C to enable some functionality not inherent to the language, you should change the approach or switch to a different language.

I think if you continue to do serious C hacking, you'll find a lot of places where macros are legitimately a good choice.

For example, I've found that macros are very, very useful when it comes to implementing error handling in a robust way. In C, the only real way to detect whether a function has failed is to return an error code. The caller then has to test for the error code and take the appropriate action if there was an error. I'm working on a side project now that was doing a LOT of this because it does a lot of IO and memory allocation (and basically any operation you do on a file or any call to `malloc` or a function that calls `malloc` can spontaneously fail in C). So my source code, after a while, ended up looking more or less like this:

    if ((f = open_file()) == NULL)
        return ERROR_CODE;
    if ((s = allocate_string()) == NULL)
        return ERROR_CODE;
    if (write_string_to_file(s, f) < 0)
        return ERROR_CODE;
... and so on, and so forth. It got to the point where basically all of the function calls I did needed to be manually checked for an error, which made all of my code really messy and hard to read. The only real way to refactor these checks in pure C without sacrificing robustness is to use a macro:

    #define MY_ASSERT(c) do { if (!(c)) return ERROR_CODE; } while (0)
    MY_ASSERT((f = open_file()) != NULL);
    MY_ASSERT((s = allocate_string()) != NULL);
    MY_ASSERT(write_string_to_file(s, f) >= 0);
This has the same effect as the code above, without sacrificing comprehensibility or readability.

> Also macros are usually used to "speed up" the code. You should never optimize before finding the real bottleneck in your code. And thus always use functions instead of macros.

I think a lot of people on HN and /r/programming fail to understand why premature optimization is a bad thing. A lot of programmers hide behind the "premature optimization is the root of all evil" thing to justify their pre-existing biases or their inefficient code. The point is that you shouldn't spend a great deal of time optimizing functions or algorithms unless optimizing those functions or algorithms will be productive. It's wasteful to spend a bunch of time optimizing your IO operations, for example, if 95% of the time is being spent in the database. The macro-vs-function debate has nothing to do with that. Writing a macro definition takes no more effort than writing a function definition, so you shouldn't be opposed to writing

    #define SOME_OPERATION(x) (((x) * 100) / 3 + 500)
rather than

    int some_operation(int x) { return (x * 100) / 3 + 500; } 
That isn't premature optimization; you're just guaranteeing that the compiler won't introduce the overhead of a function call.
[+] __david__|12 years ago|reply
Even better rule: Know the limitations of your target processor and compiler and code accordingly.

Example 1: Some (bad) compilers just ignore the const keyword (and therefore treat consts like normal variables). Use #defines with these compilers (or get a better compiler).

Example 2: Some (older, smaller) processors don't have an integer division instruction. Avoid division at all costs on these processors unless you are not worried about space or execution speed.

[+] twoodfin|12 years ago|reply
> Some (older, smaller) processors don't have an integer division instruction. Avoid division at all costs on these processors unless you are not worried about space or execution speed.

...unless you know your compiler is capable of converting the division into cheaper operations via strength reduction, per your well-considered rule.

[+] Joeboy|12 years ago|reply
> Example 2: Some (older, smaller) processors don't have an integer division instruction. Avoid division at all costs on these processors unless you are not worried about space or execution speed.

Ok, so steer clear of division on ARM CPUs.

[+] dllthomas|12 years ago|reply
Regarding division and shifts, one thing worth noting is that you can now (with C11) bust out static assertions to make sure no one changes the count and the shift incompatibly. It's still probably not worth doing unless there's a good reason to make sure that particular operation is a shift rather than a multiply.