pseudohadamard | 11 days ago

I think it's a circular problem: the gcc developers are very insular and respond to outside input with anything from ignoring it to getting into long lawyeristic arguments about why, if you squint at the text just right, their way is the only right way, which strongly discourages outside contributions. There are only so many hours in the day, and arguing till you're blue in the face against silently mutating a piece of code into unexpected different code that always segfaults when run, based on a truly tortured interpretation of two sentences of text, gets old fast. The gcc devs would make great lawyers for bypassing things like environmental law: they'd find some tortuous interpretation of an environmental protection law that let them dump refinery waste into a national park and then gleefully do it because their particular interpretation of the law didn't prohibit it.

Contrast this with Linus' famous "we do not break userspace" rant, which is the polar opposite of the gcc devs' "we love to break your code to show how much cleverererer than you we are". Just for reference, the exact quote, https://lkml.org/lkml/2012/12/23/75, is:

  And you *still* haven't learnt the first rule of kernel maintenance?  If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs. How hard can this be to understand?  ... WE DO NOT BREAK USERSPACE!

Ah, Happy Fun Linus. Can you imagine the gcc devs ever saying "if we break your code it's a problem with gcc" or "we never blame the user"?

This really seems to be a gcc-specific problem. It doesn't affect other compilers like MSVC, Diab, IAR, Green Hills; it's only gcc and to a lesser extent clang. Admittedly this is from a rather small sample, but the big difference between those two sets that jumps out is that the first one is commercial with responsibilities to customers and the second one isn't.

uecker|10 days ago

In my experience it is worse with clang, which uses UB even more aggressively than GCC to optimize (and Chris Lattner in his famous blog post very much justified this line of thinking), and I have seen similar things with MSVC. I do not know about the others.

I think that GCC has changed a bit in recent years, but I am also not sure that an optimizing compiler can have the same policy as the kernel. For the kernel, it is about keeping APIs stable, which is realistic, but an optimizing compiler inherently relies on some semantic interpretation of the program code, and if there is a mismatch that causes something to break, it is often difficult to fix. Also, many issues did not arise because they suddenly decided "let's now exploit this UB we haven't exploited before"; rather, they had always relied on it, and an improved optimization now makes it affect more or different programs. This creates a difficult situation because it is not clear how to fix it if you don't want to roll back an improvement you spent a lot of time on and others paid for. Don't get me wrong, I agree they went too far in the past in exploiting UB, but I do think this is less of a problem looking forward, and there is also generally more concern about the impact on safety and security now.
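
As a rough illustration of the situation described above (a sketch under stated assumptions, not an example taken from this thread): the C fragment below dereferences a pointer before testing it for NULL. Dereferencing a null pointer is UB, so a compiler is allowed to infer that `dev` is non-null and treat the later check as dead code. Whether a particular GCC or clang version actually removes it depends on the version and optimization level. The `struct device` type and the function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical type, used only for this illustration. */
struct device {
    int flags;
};

/* Risky ordering: the dereference comes before the NULL test.
 * Since dereferencing NULL is undefined, an optimizer may assume
 * dev != NULL from the first line on and delete the check below. */
int read_flags_risky(struct device *dev)
{
    int flags = dev->flags;
    if (dev == NULL)        /* may be treated as unreachable */
        return -1;
    return flags;
}

/* Well-defined ordering: test first, dereference afterwards.
 * No inference about dev can be made before the check runs. */
int read_flags(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

Code shaped like `read_flags_risky` can behave as its author expected for years and then change once a compiler release propagates the "non-null" inference further. GCC's `-fno-delete-null-pointer-checks` option turns this particular inference off, at the cost of the optimization.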

pseudohadamard|9 days ago

Good point, yeah. I really want to like clang because it's not gcc but they have been following the gcc path a lot in recent years. I haven't actually seen it with MSVC, but I'm still on an old pre-bloat version of Visual Studio so maybe they've got worse in recent versions too.

I think a lot of the UB though isn't "let's exploit UB", it's "we didn't even know we had UB in the code". An example is two's-complement arithmetic, which the C language has finally acknowledged more than half a century after the last non-two's-complement machine was built (was the CDC 6600 the last one's-complement machine? Were most of the gcc devs even born when that was released?). So everyone on earth has been under the crazy notion that their computer used two's-complement maths, which the gcc (and clang) devs know is actually UB and allows them to do whatever they want with your code when they encounter it.
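
To make the two's-complement point concrete, here is a minimal sketch in C (the function names are hypothetical and not from any project mentioned in the thread). The first function is the kind of overflow test people write when they assume signed arithmetic wraps. Signed overflow is undefined in C, so an optimizer may assume `x + 1` never overflows and reduce the function to `return false`. As far as I know this is still the case in C23, which requires two's-complement representation but does not define signed overflow. The other functions show ways to get the intended behaviour without relying on UB.

```c
#include <limits.h>
#include <stdbool.h>

/* Written on the assumption that INT_MAX + 1 wraps to INT_MIN.
 * Signed overflow is undefined, so a compiler may fold this
 * whole function to "return false". */
bool next_would_overflow_naive(int x)
{
    return x + 1 < x;
}

/* Well-defined: compare against the limits before doing the
 * arithmetic. Returns false (and leaves *out untouched) if a + b
 * would not fit in an int. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Wrapping add: unsigned arithmetic wraps modulo 2^N by definition.
 * Converting an out-of-range result back to int is
 * implementation-defined rather than undefined; GCC documents it
 * as reduction modulo 2^N. */
int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

If rewriting the code is not an option, GCC and clang both accept `-fwrapv`, which tells the compiler to treat signed overflow as wrapping.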