No more confusions on tricky C declarations

[+] tptacek|15 years ago|reply

I've always had a bit more luck with the "typedef each step of the construction" rule-of-thumb. Also, I tend to hide anything as complex as "pointer-to-array-of-pointers-to-functions" (even though you memorize this idiom pretty quickly after an hour in the kernel) behind library ADTs, so you're never indexing an array, but rather passing an index and a whatever_t* to whatever_get(w, index).
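A minimal sketch of both habits in C, for illustration only. The handler_* typedef names, the struct layout, and the body of whatever_get are assumptions; only the names whatever_t and whatever_get come from the comment above.

```c
#include <assert.h>
#include <stddef.h>

/* Step 1: a function taking int and returning int. */
typedef int handler_fn(int);
/* Step 2: a pointer to such a function. */
typedef handler_fn *handler_ptr;
/* Step 3: an array of four such pointers. */
typedef handler_ptr handler_table[4];
/* Step 4: a pointer to that array. */
typedef handler_table *handler_table_ptr;

/* The same type written in one go, for comparison. */
typedef int (*(*raw_table_ptr)[4])(int);

/* Hypothetical ADT that keeps the array out of the caller's sight. */
typedef struct whatever {
    handler_table handlers;
} whatever_t;

static int twice(int x) { return 2 * x; }
static int negate(int x) { return -x; }

/* Returns the handler at index, or NULL when index is out of range. */
handler_ptr whatever_get(const whatever_t *w, size_t index)
{
    size_t count = sizeof w->handlers / sizeof w->handlers[0];
    return index < count ? w->handlers[index] : NULL;
}

/* Calls the handler at index with arg; returns 0 if there is none. */
int demo_call(size_t index, int arg)
{
    whatever_t w = { { twice, negate, NULL, NULL } };
    handler_table_ptr stepwise = &w.handlers;
    raw_table_ptr raw = stepwise;  /* same type, so no cast is needed */
    handler_ptr f = whatever_get(&w, index);
    assert(f == NULL || f == (*raw)[index]);
    return f ? f(arg) : 0;
}
```

The assignment from stepwise to raw compiles without a cast, which is a cheap way to confirm that the four small typedefs and the one-line declaration name the same type.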

[+] scott_s|15 years ago|reply

That's a good approach, but sometimes you can't do it in C++. (At least with C++03.) Consider:

  template <class T>
  T* new_align_1d(size_t d1, unsigned int align)
  {
    void* ptr = _malloc_align(d1 * sizeof(T), align);
    return new (ptr) T[d1];
  }

So, I'm defining a function new_align_1d that takes a size and an alignment. I allocate enough space for a 1-dimensional array of the given size on that alignment, then use placement new to construct the actual objects in that place. Then return the pointer. Pretty straightforward. So let's generalize one dimension up:

  template <class T, size_t d2>
  ? new_align_2d(size_t d1, unsigned int align)
  {
    void* ptr = _malloc_align(d1 * d2 * sizeof(T), align);
    return new (ptr) T[d1][d2];
  }

You can probably see what I'm doing here. So what's the return type? And what's the syntax for specifying it? Since it's a template, we can't define a typedef to help us. (C++0x should allow that with parameterized typedefs.) The answer surprised me - I really had to look at the grammar and mechanically figure out what it was going to be.

(Yes, I have to pass d2 as a template parameter - it's part of the type of the array and must be known at compile time.)
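Plain C has the same puzzle when no typedef is used, so here is a rough C sketch of a function that returns a pointer to an array. D2, alloc_2d, alloc_2d_typedef, and sum_grid are hypothetical names, calloc stands in for the aligned allocator, and alignment is ignored. By the same grammar the C++ return type would presumably be written T (*new_align_2d(size_t d1, unsigned int align))[d2], though the thread does not spell that out.

```c
#include <assert.h>
#include <stdlib.h>

#define D2 3  /* hypothetical inner dimension, standing in for d2 */

/* Reads inside-out: alloc_2d is a function taking (size_t d1)
   that returns a pointer to an array of D2 ints. */
int (*alloc_2d(size_t d1))[D2]
{
    /* calloc zero-fills; the caller releases the block with free(). */
    return calloc(d1, sizeof(int[D2]));
}

/* With a typedef the return type sits on the left as usual. */
typedef int row_t[D2];
static_assert(sizeof(row_t) == D2 * sizeof(int), "a row is D2 ints");

row_t *alloc_2d_typedef(size_t d1)
{
    return alloc_2d(d1);  /* identical type, so no cast */
}

/* Fills a d1 x D2 grid with grid[i][j] = i * D2 + j and returns the sum,
   or -1 if the allocation fails. */
int sum_grid(size_t d1)
{
    int (*grid)[D2] = alloc_2d_typedef(d1);
    int sum = 0;
    if (grid == NULL)
        return -1;
    for (size_t i = 0; i < d1; i++) {
        for (size_t j = 0; j < D2; j++) {
            grid[i][j] = (int)(i * D2 + j);
            sum += grid[i][j];
        }
    }
    free(grid);
    return sum;
}
```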

[+] rntz|15 years ago|reply

This is a Good Idea, but it doesn't help you understand other people's code if they don't follow it. Hence the need for a mnemonic rule.

[+] jakevoytko|15 years ago|reply

A coworker introduced me to the ADT-style solution for nesting problems, and it works nicely. He introduces a struct that encapsulates any nested inner collection. It's worth noting his designs use fat structs in a shallow hierarchy.

This has an added advantage (and burden) when calling the code. Since you access a field inside of the struct, you write extra names. The claim is that myList[0].fnList[0].fn(3) is more readable than myList[0][0](3). This can get cluttered with lots of nesting, but this kind of nesting usually screams "refactor me!" anyways.

I prefer a mix: make the common inner collections into data types, and typedef the rest. Doing it all the time creates too many little structs for my liking.
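A small C sketch of the two call styles being compared. Only the expressions myList[0][0](3) and myList[0].fnList[0].fn(3) come from the comment; the struct names, the 2 x 2 shape, and the helper functions are made up for illustration.

```c
#include <assert.h>

typedef int (*unary_fn)(int);

/* Flat version: an array of arrays of function pointers. */
typedef unary_fn flat_table[2][2];

/* Struct version: each level of nesting gets a named field. */
struct fn_entry { unary_fn fn; };
struct fn_group { struct fn_entry fnList[2]; };

static int add_one(int x) { return x + 1; }
static int square(int x) { return x * x; }

int call_flat(int arg)
{
    flat_table myList = { { square, add_one }, { add_one, square } };
    return myList[0][0](arg);
}

int call_named(int arg)
{
    struct fn_group myList[2] = {
        { { { square }, { add_one } } },
        { { { add_one }, { square } } },
    };
    return myList[0].fnList[0].fn(arg);
}

/* Both spellings reach the same function. */
void check_equivalence(void)
{
    assert(call_flat(3) == call_named(3));
}
```

The extra braces in the struct initializer are the "burden" side of the trade: every named level adds one more layer to write out.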

[+] joe_the_user|15 years ago|reply

The second approach, that's what I like...

[+] loup-vaillant|15 years ago|reply

An ML-like notation would be even cooler:

  char *str[10];
  str : [10] (*char)

  char *(*fp)( int, float *);
  fp : *((int, *float) -> *char)

  void (*signal(int, void (*fp)(int)))(int);
  signal : (int, *(int -> void)) -> *(int -> void)

(Oh. That last declaration did make some sense, after all…)

Really, how did they manage to choose such an inconsistent, unreadable syntax for their declarations? Is there any rational explanation?
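The arrow reading of the signal declaration can be checked in C itself with one typedef. This is only a sketch: my_signal, handler_t, the 32-slot table, and the record/ignore handlers are hypothetical names chosen to avoid colliding with <signal.h>.

```c
#include <assert.h>
#include <stddef.h>

/* *(int -> void): pointer to a function taking int and returning void. */
typedef void (*handler_t)(int);

/* Two declarations of one function. The compiler accepts the pair
   only because both spell the same type. */
void (*my_signal(int sig, void (*fp)(int)))(int);
handler_t my_signal(int sig, handler_t fp);

static handler_t installed[32];
static int last_seen = -1;

/* Installs fp for sig and returns the previous handler, or NULL. */
handler_t my_signal(int sig, handler_t fp)
{
    if (sig < 0 || sig >= 32)
        return NULL;
    handler_t old = installed[sig];
    installed[sig] = fp;
    return old;
}

void record(int sig) { last_seen = sig; }
void ignore(int sig) { (void)sig; }

/* Installs record, swaps in ignore, then calls the handler handed back. */
int demo_signal(void)
{
    my_signal(7, record);
    handler_t previous = my_signal(7, ignore);
    assert(previous == record);
    previous(7);
    return last_seen;
}
```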

[+] fexl|15 years ago|reply

The rationale is that the type declaration demonstrates how you use the variable. For example:

  char *(*fp)(int, float *)

You now have a variable called "fp". Follow that pointer by putting a * in front of it. Call that function by passing it an int and a float pointer inside parentheses. Follow that pointer by putting a * in front of it. That gives you a char.

Same kind of thing here:

  char *strings[10]

You now have a variable called "strings". Index that array by putting an offset less than 10 inside square brackets. Follow that pointer by putting a * in front of it. That gives you a char.

Here's a simpler example:

  char *str

You now have a variable called "str". Follow that pointer by putting a * in front of it. That gives you a char.

Here's the simplest example of all:

  char ch;

You now have a variable called "ch". That gives you a char.
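The "declaration mirrors use" idea can be shown by writing each use next to its declaration. A short sketch; the greeting buffer, the pick function, and the use_* wrappers are invented for the example.

```c
#include <assert.h>

static char greeting[] = "hello";

/* Hypothetical function whose type matches char *(*fp)(int, float *). */
static char *pick(int offset, float *scale)
{
    (void)scale;  /* unused; present only to match the signature */
    return &greeting[offset];
}

char use_fp(void)
{
    char *(*fp)(int, float *) = pick;
    float f = 1.0f;
    /* Same shape as the declaration: follow fp, call it, follow the result. */
    return *(*fp)(1, &f);
}

char use_strings(void)
{
    char *strings[10] = { greeting };
    /* Index the array, then follow the pointer. */
    return *strings[0];
}

char use_str(void)
{
    char *str = greeting;
    return *str;
}

/* Every expression above has type char, as the declarations promise. */
void check_uses(void)
{
    assert(use_fp() == 'e');
    assert(use_strings() == 'h');
    assert(use_str() == 'h');
}
```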

[+] endgame|15 years ago|reply

AIUI, the explanation is that they wanted this idea of "declaration follows use". For simple things, this works kind of nicely:

    char *foo;

Says that *foo will have type `char`. This sounds good in theory, but famously breaks down when things get more complicated (arrays of function pointers, multiple declarations at once, &c.).

[+] jerf|15 years ago|reply

This was on HN a bit ago, and you reminded me of it: http://www.csse.monash.edu.au/~damian/papers/HTML/ModestProp...

Basically a fully-fleshed out version of your idea.

[+] rntz|15 years ago|reply

Fails on nested arrays.

  char *foo[10][20];

The method described would indicate that this is an array 10 of pointers to array 20s of chars.

This is incorrect. It is an array 10 of array 20s of pointers to chars.

[+] tordek|15 years ago|reply

How so?

         +-----------+
         | +-+       |
         | ^ |       |
    char *foo[10][20];
     ^   ^   |       |
     |   +---+       |
     +---------------+

* foo is

* an array of ten arrays of 20

* pointers to

* char
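The compiler can settle which reading is right, since the two candidate types have different sizes. A sketch using C11 static_assert; the variable bar and the helper functions are added here purely for contrast.

```c
#include <assert.h>
#include <stddef.h>

char *foo[10][20];

/* Array 10 of array 20 of pointer to char. */
static_assert(sizeof foo == 10 * 20 * sizeof(char *), "10 x 20 pointers");
static_assert(sizeof foo[0] == 20 * sizeof(char *), "each row holds 20 pointers");
static_assert(sizeof foo[0][0] == sizeof(char *), "each element is a pointer");

/* The other reading, array 10 of pointer to array 20 of char,
   needs explicit parentheses. */
char (*bar[10])[20];
static_assert(sizeof bar == 10 * sizeof(char (*)[20]), "10 pointers");
static_assert(sizeof *bar[0] == 20, "each pointer targets 20 chars");

size_t foo_rows(void) { return sizeof foo / sizeof foo[0]; }
size_t foo_cols(void) { return sizeof foo[0] / sizeof foo[0][0]; }
size_t bar_len(void)  { return sizeof bar / sizeof bar[0]; }
```

If the declaration of foo meant the second thing, the first three assertions would fail to compile on any platform where a pointer is not one byte wide.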

[+] jeffmax|15 years ago|reply

http://cdecl.org/

[+] javert|15 years ago|reply

"So is somebody gonna create a tool that uses this parser so you can declare your C in English?" -My roommate

[+] exit|15 years ago|reply

this is fantastic, but apparently "void (*signal(int, void (*fp)(int)))(int);" is a syntax error?

[+] Amnon|15 years ago|reply

I don't see where the spiral comes in. The following rule is simpler:

(1) Begin at the variable name, read from left to right, then go back to the name and read from right to left.

(2) Give precedence to expressions in parentheses.

For example: char *(*fp)( int, float * )

The innermost expression is (*fp). There is nothing to the right of fp, so read to the left: "*", so it's a pointer. Next, we go right and see an argument list, so it's a pointer to a function taking these arguments. Go back to where we started and read right to left: pointer to a function taking (int, float *) that returns a pointer to char.

[+] ashishb4u|15 years ago|reply

left-to-right and right-to-left is a spiral, in fact :)

[+] rue|15 years ago|reply

    char* str[10]; /* Better */

[+] gmartres|15 years ago|reply

I disagree, the following:

  int* foo, bar;

could be interpreted as declaring two pointers, whereas:

  int *foo, bar;

makes it clear that only the first variable is a pointer.
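A quick way to see what the first spelling really declares, sketched with a made-up helper function:

```c
#include <assert.h>

int* foo, bar;  /* foo is an int *, bar is a plain int */

static_assert(sizeof foo == sizeof(int *), "foo is a pointer");
static_assert(sizeof bar == sizeof(int), "bar is an int");

/* Writes value into bar through foo and returns bar. */
int store_through_foo(int value)
{
    foo = &bar;   /* type-checks only because bar is an int, not an int * */
    *foo = value;
    return bar;
}
```

The assignment foo = &bar is the real check: if bar were also a pointer, &bar would be an int ** and the compiler would complain about the mismatch.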

[+] milod|15 years ago|reply

I've always found putting the * next to the type instead of the variable, like this, more intuitive. Does anyone know why the other way is used more often? Is it just historic, or is there a more practical reason?

[+] MtL|15 years ago|reply

Meh! This is just an overcomplication of the right-left rule, which makes you think about "complex" 2D geometry instead of a couple of simple spatial pointers in the declaration you are trying to parse.

The easier, more useful version: http://ieng9.ucsd.edu/~cs30x/rt_lt.rule.html

[+] joe_the_user|15 years ago|reply

It is amazing to me the number of people who would take the time to make ASCII graphics in their replies.

I'm blessed that even munging 15+ Linux C/C++ libraries lately I haven't run into anything requiring this, though my Intro to C class at Merit College twenty-five years ago did teach this rule.
