Lesser known tricks, quirks and features of C

[+] ufo|3 years ago|reply

Fun fact about %n:

Mazda cars used to have a bug where they used printf(str) instead of printf("%s", str) and their media system would crash if you tried to play the "99% Invisible" podcast in them. All because the "% In" was parsed as a "%n" with some extra modifiers. https://99percentinvisible.org/episode/the-roman-mars-mazda-...
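
For anyone who hasn't seen the difference spelled out, here is a minimal sketch. The helper name render_title is made up for this example and is not Mazda's code; the point is only that a fixed "%s" format copies the title verbatim, while using the title itself as the format lets its '%' characters be parsed as conversion specifiers.

```c
#include <stdio.h>

/* Hypothetical helper: formats an untrusted title into buf.
 * The title is passed as an argument to a fixed "%s" format, so any
 * '%' characters inside it are copied verbatim instead of being parsed
 * as conversion specifiers. Returns what snprintf returns. */
int render_title(char *buf, size_t size, const char *title)
{
    /* Unsafe variant (do not use): snprintf(buf, size, title); */
    return snprintf(buf, size, "%s", title);
}
```

Calling it with "99% Invisible" leaves exactly that text in the buffer.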

[+] tom_|3 years ago|reply

"format not a string literal" is one warning I always upgrade to an error. Dear reader: you should do this, too!

[+] rerdavies|3 years ago|reply

Fun fact about %n:

The %n functionality also makes printf accidentally Turing-complete even with a well-formed set of arguments. A game of tic-tac-toe written in the format string is a winner of the 27th IOCCC.

- sez wiki.

A not so fun fact:

Because the %n format is inherently insecure, it's disabled by default.

- MSVC reference.

[+] gdprrrr|3 years ago|reply

Also iPhones had an RCE using a WiFi name that contained %s https://thehackernews.com/2021/07/turns-out-that-low-risk-io...

[+] GolangProject|3 years ago|reply

This is one of those annoying little problems that is easily picked up by the vet command (https://pkg.go.dev/cmd/vet) when writing Go code. There are, of course, many linters that do the same thing in C, but it's nice to have an authoritative one built in as part of the official Go toolchain, so everyone's code undergoes the same basic checks.

[+] milgra|3 years ago|reply

Very nice collection. My favorite C feature is actually a gcc/clang feature: the __INCLUDE_LEVEL__ predefined macro. It made me code and maintain my C projects exactly twice as fast as before, because the file count dropped by half: https://github.com/milgra/headerlessc .

[+] xigoi|3 years ago|reply

How does this help? It just moves the content of the .h file to the .c file, but you still need to Write Everything Twice.

[+] account42|3 years ago|reply

Is having two files really that much of a bother? I have my editor set to switch between the .c(pp) and the .h with a keyboard shortcut, and that seems easier than scrolling between declaration and definition when you want to change something.

[+] enriquto|3 years ago|reply

I love this. Somehow it feels more elegant than "header-only" libraries.

[+] 3836293648|3 years ago|reply

How do you handle third party headers?

[+] gavinray|3 years ago|reply

This is great!

[+] LegionMammal978|3 years ago|reply

> volatile type qualifier

> This qualifier tells the compiler that a variable may be accessed by other means than the current code (e.g. by code run in another thread or it's MMIO device), thus to not optimize away reads and writes to this resource.

It's dangerous to mention cross-thread data access as a use case for volatile. In standard C, modifying any non-atomic value on one thread, while accessing it on another thread without synchronization, is always UB. Volatile variables do not get any exemption from this rule. In practice, the symptoms of such a data race include the modification not being visible on the other thread, or the modified value getting torn between its old and new states.
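
As a rough sketch of what the standard-conforming alternative can look like, here is a one-slot handoff using C11 <stdatomic.h>. The names payload, ready, publish and try_consume are invented for this example, and it assumes one producer thread and one consumer thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-slot mailbox shared between two threads. */
static int payload;        /* plain data, published via the flag below */
static atomic_bool ready;  /* static storage: starts out false */

/* Producer side: write the data, then publish it with a release store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: an acquire load that observes `true` also makes the
 * earlier write to `payload` visible. Returns false if nothing is ready. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

The release/acquire pair on the flag is what orders the accesses to payload; a volatile flag would give no such guarantee.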

[+] Jorengarenar|3 years ago|reply

`volatile` is one of the things we need to pay attention to when dealing with threads, but as you notice it's not the only one.

Eskil Steenberg talks about it at 12:42 in his talk Advanced C: The UB and optimizations that trick good programmers. [0]

[0]: https://youtu.be/w3_e9vZj7D8?t=762

[+] mid-kid|3 years ago|reply

These days it can be chained with _Atomic to achieve the desired effect. That said, oftentimes you need more serious synchronization mechanisms your library would provide.

[+] plugin-baby|3 years ago|reply

UB?

[+] tastysandwich|3 years ago|reply

"Expert C Programming: Deep C Secrets" is a really good book to learn a lot of C tricks and quirks, plus some history. I read it a few years ago and loved it.

I was a grad when I read it and remember annoying my older coworkers for a few weeks with little gotchas I picked up. "hey what do you think THIS example prints?" "Stop sending me these!"

[+] AceJohnny2|3 years ago|reply

Compound Literals in C are great. They're no surprise to anyone coming from more sophisticated languages, but I've never seen them used in the C codebases I've worked on.

What with C also allowing structures as return values, another rarely-used feature, they're really useful for allowing a richer API than the historical `int foo(...)` that so many people are used to seeing.
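
To make that concrete, a small sketch of the kind of API this allows. The type div_result and the function checked_div are hypothetical names picked for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical result type: a status flag plus a value. */
struct div_result {
    bool ok;
    int  quotient;
};

/* Returns the result by value; each return builds it with a compound
 * literal. Members not named in the literal are zero-initialized. */
struct div_result checked_div(int num, int den)
{
    if (den == 0 || (num == INT_MIN && den == -1))
        return (struct div_result){ .ok = false };
    return (struct div_result){ .ok = true, .quotient = num / den };
}
```

The caller gets both the status and the value from one call, with no out-parameter and no magic error value.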

C has so much legacy that it's really hard for even decades-old (C99!) features to impose themselves. Or perhaps that's MSVC's lagging support that's to blame :p

[+] icedchai|3 years ago|reply

For many, I think "C" is still "C89."

I remember working on a commercial project in the mid-2000's that still had #ifdefs for K&R C prototypes (meaning, pre-ANSI C.) This was a recent-ish project at the time, started in 2000. Were people going to go back in time and compile it on an old architecture? I doubt it.

C moves slow.

[+] pjmlp|3 years ago|reply

MSVC supports C11 and C17, minus the C99 stuff that was made optional in C11.

Anyway, given the option, one should always favour C++ over C if they care about secure code; while not perfect, it is much better than anything a C compiler will do.

[+] teo_zero|3 years ago|reply

I think the 5[arr] deserves more love.

  // traditional syntax:
  boxes[products[myorder.product].box].weight
  // index[array]:
  myorder.product[products].box[boxes].weight
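
This works because a[i] is defined as *((a) + (i)), and the addition commutes. A compilable sketch with made-up tables (the struct layouts and contents are assumptions, only there to mirror the shape of the lines above):

```c
/* Hypothetical tables for illustration. */
struct box     { int weight; };
struct product { int box; };

static const struct box     boxes[]    = { { 10 }, { 25 } };
static const struct product products[] = { { 1 }, { 0 } };

/* Traditional syntax: array[index]. */
int weight_traditional(int product)
{
    return boxes[products[product].box].weight;
}

/* Reversed syntax: index[array]; reads left to right like a pipeline. */
int weight_reversed(int product)
{
    return product[products].box[boxes].weight;
}
```

Both functions return the same value for every valid index.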

[+] Jorengarenar|3 years ago|reply

Don't mind if I do

[+] foobiekr|3 years ago|reply

Most of these are pretty familiar if you're old enough, but this is a wonderful list.

I didn’t know C23 was getting rid of trigraphs. That’s probably a good thing and easy to clean up if needed.

[+] Joker_vD|3 years ago|reply

Never quite understood why compound literals are lvalues, but fine, whatever, I guess, it's so that you can write "&(struct Foo){};" instead of "struct Foo tmp; &tmp;"... which, on a tangential note, reminds me about Go: the proposals to make things like &5 and &true legal in Go were rejected because "the implied semantics would be unclear" even though &structFoo{} is legal and apparently has obvious semantics.

[+] dantle|3 years ago|reply

Nice article. Saw a few things I wish I'd known about.

1. %n in printf would be handy when writing CLIs dealing w/ multiple lines or precise counts of backspaces.

2. Using enums as a form of static_assert() is a great idea (triggering a div by zero compiler error).
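
For the second point, a minimal sketch of how the enum trick can be wrapped up. The macro name ENUM_ASSERT and the enumerator prefix are invented here; since C11, _Static_assert does the same job with a readable message.

```c
#include <limits.h>

/* If cond is false the enumerator becomes 1 / 0, which is not a valid
 * integer constant expression, so compilation fails. */
#define ENUM_ASSERT(name, cond) \
    enum { enum_assert_##name = 1 / (int)!!(cond) }

ENUM_ASSERT(char_has_8_bits, CHAR_BIT == 8);
ENUM_ASSERT(int_at_least_16_bits, sizeof(int) * CHAR_BIT >= 16);

/* When the checks pass, the enumerators are ordinary constants equal to 1. */
int enum_assert_value(void)
{
    return enum_assert_char_has_8_bits;
}
```

Each check needs a unique name, because two enumerators with the same name in one scope would clash.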

[+] gallier2|3 years ago|reply

Cool. Two of the tricks shown are from my contribution in stackoverflow.

[+] nstbayless|3 years ago|reply

Here's another one. Handy "syntax" that makes it possible to iterate an unsigned type from N-1 to 0. (Normally this is tricky.)

for (unsigned int i = N; i --> 0;) printf("%u\n", i);

This --> construction also works in JavaScript and so on.

[+] mtklein|3 years ago|reply

It's worth noting that this does also work on signed types, so it can be a kind of handy idiom to see

   while (N --> 0) { ... }

and know it will execute N times no matter the details of the type of N.

[+] titzer|3 years ago|reply

AFAICT this would parse as "(i--) > 0", there's no "-->" operator.
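
Spelled out with the parentheses, as a small sketch (the function count_down and its out-parameter are hypothetical, just to make the loop observable). Note that a condition like i >= 0 would always be true for an unsigned type, which is why the test has to happen on the value before the decrement.

```c
#include <stddef.h>

/* Writes n-1, n-2, ..., 0 into out (which must hold n elements) and
 * returns how many values were written. */
size_t count_down(unsigned int n, unsigned int *out)
{
    size_t written = 0;
    /* Same as `i --> 0`: compare the old value of i with 0, then decrement. */
    for (unsigned int i = n; (i--) > 0; )
        out[written++] = i;
    return written;
}
```

With n equal to 0 the body never runs, so the idiom is also safe for empty ranges.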

[+] stonegray|3 years ago|reply

If you're gonna test the i--, shouldn't it fall through on zero anyway?

    for (unsigned int i = N; i--;){}

    unsigned int i = N;
    while(i--){ ... }

Also I think I'm missing the tricky part. Couldn't this be a bog-standard for loop?

   for (unsigned int i = N - 1; i > 0; i--){ ... }

The "downto" pseudooperator definitely scores some points for coolness and aesthetics, but there's no immediately obvious use case for me.

[+] Jorengarenar|3 years ago|reply

I was hesitant to put it on the list, but fine, you convinced me

[+] skribanto|3 years ago|reply

how would you iterate over every possible value of an unsigned int?
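
One common answer, sketched here with unsigned char so it finishes instantly (the same shape should carry over to unsigned int): use a do/while loop and stop when the counter wraps back to 0. The function name is made up for the example.

```c
#include <limits.h>

/* Counts the values of unsigned char by visiting each one exactly once. */
unsigned long count_uchar_values(void)
{
    unsigned long visited = 0;
    unsigned char c = 0;
    do {
        visited++;        /* body runs for c = 0, 1, ..., UCHAR_MAX */
    } while (++c != 0);   /* c wraps to 0 after UCHAR_MAX, ending the loop */
    return visited;
}
```

A plain for loop with c <= UCHAR_MAX as its condition would never terminate, since that comparison is always true.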

[+] zabzonk|3 years ago|reply

the c training course at a popular uk training company (the instruction set) had duff's device on something like page 5 of their c course - expunging it was one of the first things i did when i joined them. there were many others.

[+] zwieback|3 years ago|reply

Be interesting to see when these features showed up. I learned C from the K&R book back in the day and it doesn't mention most of these.

Designated initializer is something I'll try to remember, seems handy.

[+] flohofwoe|3 years ago|reply

Designated init and compound literals were added in C99. I think there are two reasons for those features not being better known:

1) C++ 'forked' their C subset before C99 (ca. "C95"), and while C++20 finally got its own version of designated init, this has so many restrictions compared to C99 that it is basically pointless.

2) MSVC hasn't supported any important C99 features until around 2016

[+] AceJohnny2|3 years ago|reply

Yeah the K&R, while being a masterpiece of clarity and conciseness, is severely outdated in many important ways.

I wish there was some effort to create a modern version while preserving the clarity and conciseness of Kernighan and Ritchie.

Designated initializers in particular are extremely useful. I once halted a factory line for days because of a mistake they would have avoided.
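
For readers who haven't used them, a short sketch of both forms. The struct motor_config and its values are invented for illustration and have nothing to do with the factory incident above.

```c
/* Hypothetical settings struct for illustration. */
struct motor_config {
    int max_rpm;
    int ramp_ms;
    int reversed;
};

/* Members are named, so reordering the struct later cannot silently
 * swap values; members not mentioned (reversed) are set to zero. */
static const struct motor_config default_config = {
    .ramp_ms = 250,
    .max_rpm = 3000,
};

/* Array form: indices are named, gaps are zero-filled. */
static const int days_in_month[12] = { [0] = 31, [1] = 28, [11] = 31 };

const struct motor_config *get_default_config(void) { return &default_config; }
int get_days(int month_index) { return days_in_month[month_index]; }
```

With positional initializers, swapping two int members in the struct definition would compile cleanly and quietly change the meaning of every initializer.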

[+] suprjami|3 years ago|reply

Designated initialisers were added in C99

[+] jokoon|3 years ago|reply

I'm currently trying to make a programming language that translates to C. I've started with a pseudo BNF parser.

I started reading the C BNF and I have to admit that I was not prepared at all. It's not as easy as it sounds.

I cannot imagine how difficult it must be to maintain a modern C++ compiler.

[+] vocram|3 years ago|reply

Yet another proof that C is simple but not easy.

[+] int_19h|3 years ago|reply

One non-obvious thing about named function types is that they can also be used to declare (but not define) functions:

   typedef void func(int);
   func f;
   void f(int x) { (void)x; }  // before C23, a definition must name its parameters

I don't think I've ever seen a practical use for this in C, though. In C++, where this also works, and extends to member functions, this can be very occasionally useful in conjunction with decltype to assert that a function has signature identical to some other function - e.g. when you're intercepting and detouring some shared library calls:

    int foo();
    decltype(foo) bar;

I suppose with typeof() in C23 this might also become more interesting.
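
A sketch of what that might look like in C. It uses the __typeof__ spelling, which GCC and Clang accept as an extension; C23 standardizes it as typeof. The function names original and detour are hypothetical.

```c
/* Hypothetical original function and a replacement that must match it. */
int original(int x) { return x + 1; }

/* Declares `detour` with exactly the type of `original`. */
__typeof__(original) detour;

/* The definition must agree with that declaration, or compilation fails. */
int detour(int x) { return original(x) * 2; }
```

If the signature of original changes later, the definition of detour stops compiling instead of drifting silently.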

[+] ljosifov|3 years ago|reply

Great read, and led me to "When VLA in C doesn't smell of rotten eggs" https://blog.joren.ga/vla-usecases and this:

  int n = 3, m = 4;
  int (*matrix_NxM)[n][m] = malloc(sizeof *matrix_NxM); // `n` and `m` are variables with dimensions known at runtime, not compile time
  if (matrix_NxM) {
      // (*matrix_NxM)[i][j] = ...;
      free(matrix_NxM);
  }

Well, that makes much easier a few things I'm doing atm, really glad I read it.
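
A slightly fuller sketch of the same pattern, wrapped in a function so it can be run. The name fill_and_sum and the fill rule are made up for the example; it assumes a compiler with VLA support (optional since C11) and nonzero dimensions.

```c
#include <stdlib.h>

/* Allocates an n-by-m matrix on the heap through a pointer to a VLA type,
 * fills cell (i, j) with i * m + j, and returns the sum of all cells.
 * n and m must be nonzero. Returns -1 if the allocation fails. */
long fill_and_sum(size_t n, size_t m)
{
    int (*matrix)[n][m] = malloc(sizeof *matrix);
    if (matrix == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < m; j++) {
            (*matrix)[i][j] = (int)(i * m + j);  /* normal 2-D indexing */
            sum += (*matrix)[i][j];
        }
    }
    free(matrix);
    return sum;
}
```

Only the type is variably modified; the storage comes from malloc, so there is no stack usage that grows with n and m.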

[+] augustk|3 years ago|reply

There are three macros which I find indispensable and which I use in all my C projects, namely LEN, NEW and NEW_ARRAY. I keep them in a file named Util.h:

  #ifndef UTIL_H
  #define UTIL_H

  #include <errno.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  #define LEN(arr) (sizeof (arr) / sizeof (arr)[0])

  #define NEW_ARRAY(ptr, n) \
     do { \
        (ptr) = malloc((n) * sizeof (ptr)[0]); \
        if ((ptr) == NULL) { \
           fprintf(stderr, "Memory allocation failed: %s\n", strerror(errno)); \
           exit(EXIT_FAILURE); \
        } \
     } while (0)

  #define NEW(ptr) NEW_ARRAY(ptr, 1)

  #endif

With these in place, working with arrays and dynamic memory is safer, less verbose and readability is improved.

[+] Gibbon1|3 years ago|reply

With gcc you can use this to get the elements of an array.

   // will barf if fed a pointer
   #define sizeof_array(arr) \
       (sizeof(arr) / sizeof((arr)[0]) \
       + sizeof(typeof(int[1 - 2 * \
       !!__builtin_types_compatible_p(typeof(arr), \
       typeof(&arr[0]))])) * 0)

[+] Razengan|3 years ago|reply

I wish there was a language "between" assembly and C: basically assembly with some quality-of-life improvements.

Shortcuts to reduce redundant chores (like those multiple instructions to load one 64-bit number into an ARM register) but minimal "magic" or unintended consequences as in C. Things like maybe a function call syntax like:

CALL someFunc(R1: thingForRegister1, @R7: pushR7ThenPopOnReturn, R42: [memoryAddressForR42])

and the function might be defined as:

someFunc(R1 as localNameForR1, R7 as oneMoreThing, R42? as optionalArgument)

and so on. (but anyone could come up with better ideas than me)

[+] spc476|3 years ago|reply

You just need a decent assembler with macro support. Here's a few lines of code I have for 32-bit x86 code:

    PRINTF "main_task: sf=%p",ebp
    PRINTF " sf: 1=%d 2=%d 3=%d 4=%d",dword [ebp+8],dword [ebp+12],eax,ebx

I'm using NASM and it wasn't hard to write the macro to do so.

[+] xigoi|3 years ago|reply

I think QBE might be what you're looking for?

https://c9x.me/compile/

[+] Miserlou57|3 years ago|reply

“Quirks and features”

176 comments