Show HN: Cdecl-dump - represent C declarations visually

35 points| bluetomcat | 2 months ago |github.com

A small tool that parses C declarations and outputs a simple visual representation at each stage, as it encounters arrays, pointers or functions.

The program uses a table-driven lexer and a hand-written, shift-reduce parser. No external dependencies apart from the standard library.

13 comments

xvilka|2 months ago

We use tree-sitter[1] for parsing C declarations in Rizin[2] (see the "td" command, for example). See our custom grammar[3] (a modified version of the mainstream tree-sitter-c). The custom grammar was sadly necessary because Tree-Sitter does not support alternate roots[4].

P.S. Please add a license for your code.

[1] https://tree-sitter.github.io/

[2] https://github.com/rizinorg/rizin/tree/dev/librz/type/parser

[3] https://github.com/rizinorg/rizin-grammar-c/

[4] https://github.com/tree-sitter/tree-sitter/issues/711

coherentpony|2 months ago

I don’t understand what the visualisation screenshot in the README is trying to communicate to me.

bluetomcat|2 months ago

It starts from the identifier. At every stage, it outputs a sub-expression which is the “mirrored use” and corresponds to the boxed representation below it. When it reaches the top of the expression, it prints the final type of the expression which is the lone specifier-qualifier list.

As per the screenshot, “arr” is an array of 4 elements. Consequently, “arr[0]” is an array of 8 elements. Then, “arr[0][0]” is a pointer. And so on, until we arrive at the specifier-qualifier list.
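The stages described above can be checked at compile time. This is only a sketch: the declaration `char *arr[4][8]` and the helper names are assumptions chosen to match the stages named in the comment, not the actual declaration from the README screenshot.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical declaration: array of 4 arrays of 8 pointers to char. */
char *arr[4][8];

/* Stage 1: "arr" is an array of 4 elements. */
size_t outer_len(void) { return sizeof arr / sizeof arr[0]; }

/* Stage 2: "arr[0]" is an array of 8 elements. */
size_t inner_len(void) { return sizeof arr[0] / sizeof arr[0][0]; }

void check_stages(void)
{
    /* Stage 3: "arr[0][0]" is a pointer (here, to char). */
    static_assert(_Generic(arr[0][0], char *: 1, default: 0),
                  "arr[0][0] should be a char *");

    /* Final stage: "*arr[0][0]" has the type named by the
       specifier-qualifier list, which is plain char in this sketch. */
    static_assert(_Generic(*arr[0][0], char: 1, default: 0),
                  "*arr[0][0] should be a char");
}
```

Each expression is the "mirrored use" of one more layer of the declarator, and the last one leaves only the specifier.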

pcfwik|2 months ago

Since this is about C declarations: for anyone who (like me) had the misfortune of learning the so-called "spiral rule" in college rather than being taught how declarations in C work, below are some links that explain the "declaration follows use" idea that (AFAIK) is the true philosophy behind C declaration syntax (and significantly easier to remember/read/write).

TL;DR: you declare a variable in C _in exactly the same way you would use it:_ if you know how to use a variable, then you know how to read and write a declaration for it.

https://eigenstate.org/notes/c-decl

https://news.ycombinator.com/item?id=12775966

userbinator|2 months ago

> if you know how to use a variable, then you know how to read and write a declaration for it.

In other words, the operators in a declaration have exactly the same precedence as in its use.

nitrix|2 months ago

That is correct.

  int x, *p, arr[5], fn(), (*pfn)();

Using x, or dereferencing p, or subscripting the array arr, or calling the function fn, or dereferencing the function pointer pfn and then calling it: all of these expressions produce an int.

It's the intended way to read/write declarations/expressions. As a consequence, asterisks end up placed near the identifiers. The confused ones will think it's a stylistic choice and won't understand any of this.
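A minimal sketch of that claim, which a compiler can verify. It reuses the declaration from the comment, with `(void)` prototypes substituted for the empty parentheses; the function body of `fn`, the helper `sum_of_uses`, and the values assigned are assumptions made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */

/* One specifier (int), five declarators. */
int x, *p, arr[5], fn(void), (*pfn)(void);

/* Hypothetical definition so that fn and pfn can actually be called. */
int fn(void) { return 4; }

/* Uses each name in the same shape as its declarator. */
int sum_of_uses(void)
{
    x = 1;
    p = &x;
    arr[0] = 3;
    pfn = fn;

    /* Every "mirrored use" has type int. */
    static_assert(_Generic(x,        int: 1, default: 0), "x");
    static_assert(_Generic(*p,       int: 1, default: 0), "*p");
    static_assert(_Generic(arr[0],   int: 1, default: 0), "arr[0]");
    static_assert(_Generic(fn(),     int: 1, default: 0), "fn()");
    static_assert(_Generic((*pfn)(), int: 1, default: 0), "(*pfn)()");

    return x + *p + arr[0] + fn() + (*pfn)();   /* 1 + 1 + 3 + 4 + 4 */
}
```

Writing `int* p, q;` instead would still declare `q` as a plain `int`, because the asterisk belongs to the declarator `*p` and not to the specifier.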