joosters | 1 year ago

Is this really C++ specific though? It seems like the optimisations are happening on a lower level, and so would 'infect' other languages too.

Whatever the language, at some point in performance tweaking you will end up having to look at the assembly produced by your compiler, and discovering all kinds of surprises.

tialaramex|1 year ago

LLVM isn't perfect, but the problem here is that there's a C++ compiler flag (-ffast-math) which says OK, disregard how arithmetic actually works, we're going to promise across the entire codebase that we don't actually care and that's fine.

This is nonsense, but it's distressingly common for C and C++ programmers to use this sort of inappropriate global modelling. It cannot scale. It works OK for one-person projects: "Oh, I use the Special Goose Mode to make routine A better, so even though normal Elephants can't Teleport I need to remember that in Special Goose Mode the Elephants in routine B might Teleport". In practice you'll screw this up, but it feels like you'll get it right often enough to be valuable.

In a large project where we're doing software engineering this is complete nonsense. Now Jenny, the newest member of the team working on routine A, will see that obviously Special Goose Mode is a great idea and turn it on, whereupon the entirely different team handling routine B finds that their fucking Elephants can now Teleport. WTF.

The need to never do this is why I was glad to see Rust stabilize (e.g.) u32::unchecked_add fairly recently. This (unsafe, obviously) method says: no, I don't want checked arithmetic, or wrapping, or saturating, I want you to assume this cannot overflow. I am formally promising that this addition is never going to overflow, in order to squeeze out the last drops of performance.

Notice that's not a global flag. I can write let a = unsafe { b.unchecked_add(c) }; in just one place in a 50MLOC system, and for just that one place the compiler can go absolutely wild optimising for the promise that overflows never happen - and yet right next door, even on the next line, I can write let x = y + z; and that still gets the kid gloves, if it overflows nothing catches on fire. That's how granular this needs to be to be useful, unlike C++ -ffast-math.
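To make that concrete, here is a minimal sketch of the per-call-site promise next to the explicit alternatives. The helper name add_widened is made up for illustration, and it assumes a toolchain recent enough to have u32::unchecked_add on stable. The u16 inputs are only there so the "cannot overflow" promise is obviously true inside the function.

```rust
/// Hypothetical helper: adds two u16 values as u32.
/// The promise "this never overflows" holds for every possible input,
/// so the unsafe block is sound without trusting the caller.
fn add_widened(b: u16, c: u16) -> u32 {
    // SAFETY: each operand is at most 65_535, so the sum is at most
    // 131_070, far below u32::MAX. Overflow here would be undefined behaviour.
    unsafe { u32::from(b).unchecked_add(u32::from(c)) }
}

fn main() {
    // The one place where we make the promise.
    let a = add_widened(40, 2);
    println!("unchecked: {a}");

    // Right next door, the other flavours still behave as documented.
    println!("checked:    {:?}", u32::MAX.checked_add(1)); // None on overflow
    println!("wrapping:   {}", u32::MAX.wrapping_add(1)); // wraps around to 0
    println!("saturating: {}", u32::MAX.saturating_add(1)); // clamps at u32::MAX
}
```

The promise lives in one expression with a SAFETY comment beside it, so a reviewer can check it locally instead of auditing the whole build configuration.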

gpderetta|1 year ago

You can set fast math (or a subset of it) on a translation unit basis.

nikic|1 year ago

LLVM actually also supports instruction-level granularity for fast-math (using essentially the same mechanism as things like unchecked_add), but Clang doesn't expose that level of control.

nickelpro|1 year ago

If you're using floating point at all you have declared you don't care about determinism or absolute precision across platforms.

Fast math is simply saying "I care even less than IEEE".

This is perfectly appropriate in many settings, but _especially_ video games where such deterministic results are completely irrelevant.
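For a feel of what IEEE 754 pins down and fast math relaxes: floating-point addition is not associative, and reassociation is one of the things fast-math style flags permit. A small sketch (the function names sum_left and sum_right are just for illustration) showing that the grouping changes the result under strict IEEE rules:

```rust
/// Adds left to right: (a + b) + c.
fn sum_left(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// Adds right to left: a + (b + c).
fn sum_right(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}

fn main() {
    let left = sum_left(0.1, 0.2, 0.3);
    let right = sum_right(0.1, 0.2, 0.3);
    println!("(a + b) + c = {left}");
    println!("a + (b + c) = {right}");
    // Under IEEE 754 double precision these two groupings round differently.
    println!("equal? {}", left == right);
}
```

Rust keeps strict semantics here, so the two functions give different answers every time. A compiler allowed to reassociate may pick either grouping, which is the "care even less" part.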