top | item 46806178

Lectem | 1 month ago

The issue with that is that a load fence may be very detrimental to perf. It doesn't really matter if rdtsc executes out of order in this code anyway, and there is no need for sync between cores.

rdtsc|1 month ago

You could first measure the perf impact of the fence instruction and then subtract that out? But yeah, I guess it may not matter much for a quick-and-dirty calibration loop.
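A rough sketch of that "measure the fence, then subtract it" idea, in C for GCC/Clang on x86-64. All the function names here (`read_ticks`, `read_ticks_fenced`, `min_back_to_back`, `fence_overhead`, `corrected`) are made up for illustration, and taking the minimum over many back-to-back reads is just one assumed way to estimate the overhead, not something from the code being discussed. The non-x86 branch exists only so the sketch compiles elsewhere; it has no fence and measures nothing interesting.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h> /* __rdtsc and _mm_lfence on GCC/Clang */

/* Plain read: the CPU is free to reorder it with nearby instructions. */
static inline uint64_t read_ticks(void) { return __rdtsc(); }

/* Fenced read: lfence before and after keeps rdtsc from being reordered
 * with surrounding instructions, at some cost. */
static inline uint64_t read_ticks_fenced(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}
#else
#include <time.h>
/* Assumption: fallback only so the sketch builds on non-x86 targets. */
static inline uint64_t read_ticks(void) { return (uint64_t)clock(); }
static inline uint64_t read_ticks_fenced(void) { return read_ticks(); }
#endif

/* Smallest delta seen between two back-to-back reads over `iters` tries.
 * The minimum filters out interrupts and other one-off noise. */
static uint64_t min_back_to_back(uint64_t (*read)(void), int iters)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < iters; i++) {
        uint64_t a = read();
        uint64_t b = read();
        if (b >= a && b - a < best)
            best = b - a;
    }
    return best;
}

/* Estimated extra ticks that fencing adds to one timed interval. */
static uint64_t fence_overhead(int iters)
{
    uint64_t plain = min_back_to_back(read_ticks, iters);
    uint64_t fenced = min_back_to_back(read_ticks_fenced, iters);
    return fenced > plain ? fenced - plain : 0;
}

/* Subtract the estimated overhead from a fenced measurement, clamping at 0. */
static uint64_t corrected(uint64_t fenced_delta, uint64_t overhead)
{
    return fenced_delta > overhead ? fenced_delta - overhead : 0;
}
```

Usage would be something like calling `fence_overhead(100000)` once at startup and then passing each fenced interval through `corrected()`. The estimate is only a constant offset, so it does not undo whatever the fence does to the pipeline around the code being timed.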

I read somewhere (https://aloiskraus.wordpress.com/2018/06/16/why-skylakex-cpu...) that the pause instruction has a wildly different cycle cost across CPUs, and that it caused some grief. I had no idea; I stopped doing low-level coding a while back.