Learning that you can use unions in C for grouping things into namespaces

167 points | deafcalculus | 4 years ago | utcc.utoronto.ca

147 comments

10000truths|4 years ago

Anonymous nested structs are also quite useful for creating struct fields with explicit offsets:

    #include <stdio.h>
    #include <stddef.h>  // offsetof
    #include <stdint.h>
    
    #define YDUMMY(suffix, size) char dummy##suffix[size]
    #define XDUMMY(suffix, size) YDUMMY(suffix, size)
    #define PAD(size) XDUMMY(__COUNTER__, size)
    
    struct ExplicitLayoutStruct {
        union {
            struct __attribute__((packed)) { PAD(3); uint32_t foo; };
            struct __attribute__((packed)) { PAD(5); uint16_t bar; };
            struct __attribute__((packed)) { PAD(13); uint64_t baz; };
        };
    };
    
    int main(void) {
        // offset foo = 3
        // offset bar = 5
        // offset baz = 13
        // offsetof yields a size_t, so print it with %zu rather than %d
        printf("offset foo = %zu\n", offsetof(struct ExplicitLayoutStruct, foo));
        printf("offset bar = %zu\n", offsetof(struct ExplicitLayoutStruct, bar));
        printf("offset baz = %zu\n", offsetof(struct ExplicitLayoutStruct, baz));
        return 0;
    }

WalterBright|4 years ago

Anytime macros are used for metaprogramming, it's time to reach for a more powerful language.

gumby|4 years ago

One of the very few things from C that I miss in C++ is anonymous structs and enums. I really don’t understand why they are not allowed.

That is, C style enums don’t have to have a name but “type safe” (enum class) ones do. One classic use is to name an otherwise boolean option in a function signature; there’s typically no need to otherwise name it.

C++ incompatibly requires a name for all struct and class declarations, again a waste when you will only have a single object of a given type.

oshiar53-0|4 years ago

IMO you can still be explicit about field offsets by writing the struct in a usual way, and using static assertions to ensure offsets match the intended layout.
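A minimal sketch of that approach, assuming C11 and the same GCC/Clang `packed` attribute the parent comment uses. `PlainLayout`, `pad0`, and `pad1` are made-up names, and `bar` is left out because it overlaps `foo` in the parent's layout and so cannot be written as an ordinary field:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical struct: the fields are written in the usual way,
   with explicit padding arrays between them. */
struct __attribute__((packed)) PlainLayout {
    char     pad0[3];  /* bytes 0..2   */
    uint32_t foo;      /* bytes 3..6   */
    char     pad1[6];  /* bytes 7..12  */
    uint64_t baz;      /* bytes 13..20 */
};

/* Compilation fails if the layout ever drifts from the intended offsets. */
static_assert(offsetof(struct PlainLayout, foo) == 3,  "foo must sit at offset 3");
static_assert(offsetof(struct PlainLayout, baz) == 13, "baz must sit at offset 13");
static_assert(sizeof(struct PlainLayout) == 21, "unexpected trailing padding");
```

The assertions cost nothing at run time, and anyone who edits a padding size gets a compile error instead of a silently shifted field.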

nyanpasu64|4 years ago

Do foo and bar deliberately overlap?

midjji|4 years ago

Using this invokes two kinds of undefined behaviour. It's a horrible idea and a horrible code smell; get rid of it if you ever see something like this.

flohofwoe|4 years ago

I'm using anonymous nested structs extensively for grouping related items, but I consider the extra field name a feature, not something that should be hidden:

https://github.com/floooh/sokol-samples/blob/bfb30ea00b5948f...

(also note the 'inplace initialization' which follows the state struct definition using C99's designated initialization)

kevin_thibedeau|4 years ago

The result is uglier and less maintainable than a pair of macros. Or just stop trying to hide syntax. This is ultimately on the same level as typedefing pointers.

remram|4 years ago

The first example seems wrong, instead of `struct sub { ... };` what is meant is `struct { ... } sub;`
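To illustrate the difference, here is a small sketch with made-up names (`outer`, `a`, `b`, `c`); it is not the article's code. As far as I can tell, the first form only declares a struct tag and adds no member, while the second declares a member named `sub`:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Hypothetical struct; the field names are illustrative. */
struct outer {
    int a;
    /* Writing `struct sub { int b; int c; };` here would only declare the
       tag `sub`; GCC warns that the declaration does not declare anything. */
    struct { int b; int c; } sub;   /* unnamed struct type, member named sub */
};

/* sub is a real member, so it has an offset after a. */
static_assert(offsetof(struct outer, sub) >= sizeof(int), "sub must follow a");
```

With the second form, the grouped fields are reached as `o.sub.b` and `o.sub.c`.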

siebenmann|4 years ago

You're right; thanks for noticing and I've updated the first example. My C is a bit rusty these days and I didn't check it with a compiler the way I should have.

(I'm the author of the linked-to article.)

sesuximo|4 years ago

Doesn’t matter for C, but in C++ this could make your constexpr functions UB since you can only use one member of a union in constexpr contexts (the “active” member).

pjmlp|4 years ago

In C++ we have had namespaces for 30 years now, so there is no need for such tricks.

midjji|4 years ago

Constexpr unions are the sane/safe way to use them. It's great, because if you access a member which isn't the last one written, constexpr will explicitly prevent it at compile time. Whereas all other examples here are explicitly undefined behaviour!

bruce343434|4 years ago

Imo this is not “perverse”. In my vector library I alias a vec3 as float x,y,z and float[3] using this technique.
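Roughly what that looks like, as a C11 sketch; `vec3`, `v`, and `vec3_sum` are illustrative names, not taken from my actual library. The static assertions check that the named fields really line up with the array elements. Note that this is C; C++ has different rules for reading a union member other than the one last written.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical vec3: the same three floats, reachable by name or by index. */
typedef union vec3 {
    struct { float x, y, z; };  /* C11 anonymous struct */
    float v[3];
} vec3;

/* The aliasing only works if there is no padding between x, y and z. */
static_assert(sizeof(vec3) == 3 * sizeof(float), "vec3 has unexpected padding");
static_assert(offsetof(vec3, y) == sizeof(float), "y does not line up with v[1]");

/* Index-based access is convenient for loops over the components. */
static inline float vec3_sum(const vec3 *a) {
    float s = 0.0f;
    for (size_t i = 0; i < 3; i++) {
        s += a->v[i];
    }
    return s;
}
```

Code that cares about a particular axis uses `p.x`, while generic code loops over `p.v`.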

midjji|4 years ago

This is also known as the most common invocation of undefined behaviour in game programming. If you do this, write to y, then read from [1], you are invoking undefined behaviour. Compilers doing different things here between Windows, Linux, Mac, and different compiler versions is a common cause of "why isn't my game working right on XXX, it works fine on YYY" questions.

PaulHoule|4 years ago

The C programming language, brought to you by Cthulhu.

You don't need eval(), you've got strcpy()!

rightbyte|4 years ago

I don't regard this as a "perverse" hack. If I ever do embedded memory-mapped stuff in C11, this is way too tempting.

midjji|4 years ago

You are practically guaranteed to invoke undefined behaviour if you do. Just use a map on a std::array of e.g. std::byte

Subsentient|4 years ago

Bleurgh. I have a deep soft spot for C, and I'm known to get twisted pleasure from using obscure language features in new ways to annoy people, but this is a level of abuse that even I can't get behind. If you need namespacing, use C++. As much as I love C, it's terrible for large projects.

vbezhenar|4 years ago

The Linux kernel is a large project and C is clearly sufficient for it, given that migrating to C++ would probably be very easy (not using all C++ features, just selected ones), yet it has not happened.

I think that C++ is better than C, but C is not that bad, even for large projects.

kktkti9|4 years ago

People will make a mess of a large project regardless of the language.

midjji|4 years ago

This is probably a terrible idea. Remember that once you have written one member of a union, all other members remain public, yet accessing any of them in any way is undefined behaviour. This is made way worse by most compilers mostly choosing to let you do what you think they will. They just don't guarantee they always will, or in all cases.

drfuchs|4 years ago

I believe you are mistaken. The C11 standard, section 6.5.2.3 "Structure and union members", paragraph 6, says "One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members." And that seems to be what's being used here.
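A small sketch of the guarantee quoted above, with hypothetical types (`msg_kind`, `int_msg`, `float_msg`, `msg`) made up for illustration. Both structs start with a member of the same type, so that member forms their common initial sequence:

```c
#include <assert.h>

/* Hypothetical message types; both structs begin with the same member. */
enum msg_kind { MSG_INT, MSG_FLOAT };

struct int_msg   { enum msg_kind kind; int   value; };
struct float_msg { enum msg_kind kind; float value; };

union msg {
    struct int_msg   i;
    struct float_msg f;
};

/* Reads `kind` through `i` no matter which struct was stored last;
   `kind` is in the common initial sequence, so this is the permitted case. */
static inline enum msg_kind msg_kind_of(const union msg *m) {
    return m->i.kind;
}

/* The members after the common part are only read through the struct
   that the tag says is currently stored. */
static inline float msg_float_value(const union msg *m) {
    assert(msg_kind_of(m) == MSG_FLOAT);
    return m->f.value;
}
```

The guarantee only covers the shared leading members; here `value` has a different type in each struct, so it sits outside the common initial sequence.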