macdice | 3 years ago

FWIW I just removed pg_attribute_always_inline from those three comparator functions in tuplesort.c and it made no difference to the generated code under GCC 10 at -O2. If we can show that's true for Clang and MSVC too, maybe we should just take them out. Edit: But I don't really mind them myself, they do document the intention of the programmer in instantiating all these specialisations...

maccard | 3 years ago

> FWIW I just removed pg_attribute_always_inline from those three comparator functions in tuplesort.c and it made no difference to the generated code under GCC 10 at -O2.

I experimented with this on various projects and found the same even under lower optimisation levels. I'm definitely in the camp of "profile profile profile", and the guideline that I use is "if it has internal linkage, the compiler will figure it out. If it has external linkage, then I'm profiling".
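To make that guideline concrete, here is a minimal sketch of the kind of comparison being described. It is not the tuplesort.c code: `FORCE_INLINE`, `cmp_int_plain`, `cmp_int_forced`, `sort_plain` and `sort_forced` are hypothetical names invented for illustration, and `FORCE_INLINE` is only an assumed stand-in for a macro like pg_attribute_always_inline. Both comparators have internal linkage, and the only difference between them is the forced-inline hint.

```c
#include <stddef.h>

/* Hypothetical stand-in for a macro like pg_attribute_always_inline. */
#if defined(__GNUC__)
#define FORCE_INLINE __attribute__((always_inline)) inline
#elif defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#define FORCE_INLINE inline
#endif

/* Internal linkage, no hint: the compiler decides whether to inline. */
static int cmp_int_plain(int a, int b)
{
    return (a > b) - (a < b);
}

/* Identical body, but inlining is requested explicitly. */
static FORCE_INLINE int cmp_int_forced(int a, int b)
{
    return (a > b) - (a < b);
}

/* Insertion sort specialised on the unhinted comparator. */
void sort_plain(int *v, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && cmp_int_plain(v[j - 1], key) > 0) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}

/* Same sort, specialised on the force-inlined comparator. */
void sort_forced(int *v, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && cmp_int_forced(v[j - 1], key) > 0) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}
```

Compiling this with something like `gcc -O2 -S` and comparing the assembly for `sort_plain` and `sort_forced` is one way to check whether the hint changes anything on a given compiler. What the output looks like will depend on the compiler and flags, so treat it as something to measure rather than assume.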

> they do document the intention of the programmer in instantiating all these specialisations...

Is the intent of the programmer to write them inline, or for them to be inlined for performance? Marking something as __forceinline does the former but implies the latter, and it's not true that inline == faster in every case!