top | item 27482377

tardyp | 4 years ago

I wonder why they didn't just make two separate routines with no branches other than the loop itself. Then you would only need an extra safety offset to cover a misprediction on the last loop iteration.
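Roughly what I have in mind, as a sketch rather than anything from the real code: `extended_prefetch_stub` is a hypothetical stand-in for xdcbt (here just a counter so it runs anywhere), `__builtin_prefetch` is the GCC/Clang builtin standing in for a normal prefetch, and the `LINE`, `AHEAD` and `SAFETY` values are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Assumed sizes: cache line, prefetch distance, and one spare line of margin
   in case the loop branch is mispredicted and one extra iteration is
   speculatively executed. */
enum { LINE = 128, AHEAD = 4 * LINE, SAFETY = LINE };

/* Hypothetical stand-in for xdcbt: only counts calls, touches no memory. */
static size_t g_extended_prefetches = 0;
static void extended_prefetch_stub(const void *p) {
    (void)p;
    ++g_extended_prefetches;
}

/* Routine 1: normal prefetch only. The only branches are the loop's own. */
void copy_normal(char *dst, const char *src, size_t n) {
    size_t off = 0;
    for (; off + AHEAD + LINE + SAFETY <= n; off += LINE) {
        __builtin_prefetch(src + off + AHEAD);
        memcpy(dst + off, src + off, LINE);
    }
    memcpy(dst + off, src + off, n - off);  /* tail, no prefetch */
}

/* Routine 2: extended prefetch only. No branch chooses between prefetch
   kinds; the caller picks the routine instead. */
void copy_extended(char *dst, const char *src, size_t n) {
    size_t off = 0;
    /* The loop stops SAFETY bytes early, so even one speculated extra
       iteration would still prefetch inside the source buffer. */
    for (; off + AHEAD + LINE + SAFETY <= n; off += LINE) {
        extended_prefetch_stub(src + off + AHEAD);
        memcpy(dst + off, src + off, LINE);
    }
    memcpy(dst + off, src + off, n - off);  /* tail, no prefetch */
}
```

The margin only bounds which addresses a mispredicted loop iteration could touch. It does nothing about speculative execution arriving at the routine from somewhere else.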

brucedawson|4 years ago

That would _probably_ be safe, but you have to be sure that the xdcbt instructions in the "special" function are far enough into the function that speculative execution can never reach them. Pipelines are way deeper than most people realize, so this might require a lot of instructions before the first xdcbt.

And then, for maximum performance you want prefetch instructions to be as early as possible. So, you immediately have a contradiction.

And, assuming that you resolve this, there is still the risk that a mispredicted indirect branch could end up triggering an xdcbt. So, you end up with no guarantees anyway.

saagarjha|4 years ago

Perhaps an indirect branch elsewhere may be spuriously predicted to go to the instruction?