ot1138 | 1 year ago

I had a section of code where each call to a virtual function in a critical loop cost ~20 clock cycles. That's over and above potential delays from cache misses and the need to place multiple parameters on the stack.

I was going to eliminate polymorphism altogether for this object, but later figured out how to refactor so that this particular call was made only once a millisecond. If more work was needed, it would then dispatch a task to a dedicated CPU.
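A rough sketch of that kind of refactor, for illustration only: the hot loop accumulates with plain arithmetic and crosses the virtual boundary at most once per interval. Every name here (`Handler`, `CountingHandler`, `Throttled`, `on_batch`) is hypothetical and not taken from the original code, timestamps are passed in explicitly to keep the example deterministic, and the hand-off to a dedicated CPU is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical polymorphic interface (illustrative, not the original code).
struct Handler {
    virtual ~Handler() = default;
    virtual void on_batch(std::int64_t sum, std::uint64_t now_ns) = 0;
};

// Simple implementation that records how often it was called.
struct CountingHandler final : Handler {
    int calls = 0;
    std::int64_t total = 0;
    void on_batch(std::int64_t sum, std::uint64_t /*now_ns*/) override {
        ++calls;
        total += sum;
    }
};

// Wrapper used by the hot loop: no virtual call unless the interval elapsed.
class Throttled {
public:
    Throttled(Handler& handler, std::uint64_t interval_ns)
        : handler_(handler), interval_ns_(interval_ns) {}

    // Called once per event in the critical loop.
    void on_event(std::int64_t value, std::uint64_t now_ns) {
        pending_ += value;  // cheap, non-virtual work
        if (now_ns - last_flush_ns_ >= interval_ns_) {
            handler_.on_batch(pending_, now_ns);  // the only virtual call
            pending_ = 0;
            last_flush_ns_ = now_ns;
        }
    }

    std::int64_t pending() const { return pending_; }

private:
    Handler& handler_;
    std::uint64_t interval_ns_;
    std::uint64_t last_flush_ns_ = 0;
    std::int64_t pending_ = 0;
};
```

With a 1 ms interval (`1000000` ns) and events arriving every 250 µs, the virtual call fires on roughly every fourth event instead of on every one.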

This was a huge performance improvement, and it made a significant difference to my P&L.

mgaunard | 1 year ago

Could just be inefficient register spilling forced by the ABI, since the call can't be inlined.

In general, if you're manipulating values that fit into registers and you work on a platform with a shitty ABI, you need to be very careful about what your function call boundaries look like.

The most obvious example is SIMD programming on Windows x86 32-bit.
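One way to read the advice about call boundaries, as a hedged sketch: keep the per-element helper visible to the compiler so it can be inlined, and put the real call boundary around the whole array. The names `Vec4`, `scale_add`, and `scale_add_all` are made up for illustration; whether a particular compiler actually inlines the helper or keeps the values in registers depends on the toolchain and flags, and is an assumption here.

```cpp
#include <cassert>
#include <cstddef>

// Small value type that could live in registers (illustrative).
struct Vec4 {
    float x, y, z, w;
};

// Per-element helper defined next to its caller, so the compiler has the
// chance to inline it instead of passing Vec4 across an ABI boundary.
inline Vec4 scale_add(Vec4 a, float s, Vec4 b) {
    return Vec4{a.x * s + b.x, a.y * s + b.y, a.z * s + b.z, a.w * s + b.w};
}

// Coarse boundary: one call per array rather than one per element. Even if
// this function is not inlined, the calling-convention cost is paid once.
void scale_add_all(Vec4* out, const Vec4* a, const Vec4* b, float s,
                   std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = scale_add(a[i], s, b[i]);
    }
}
```

The design choice is only about where the non-inlinable boundary sits: callers hand over pointers and a count, and the register-sized values never cross a call individually.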