stlee42 | 21 days ago
This explanation confused me too:
Each individual iteration: around 4x slower (register spilling)
Cache pressure: around 2-3x additional penalty (instructions do not fit in L1/L2 cache)
Combined over a billion iterations: 158,000x total slowdown
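If each iteration is X percent slower, then a billion iterations will also be X percent slower. Taken together, the two quoted factors give roughly 8-12x, and that ratio does not grow with the iteration count, so it is nowhere near 158,000x. I wonder what is actually going on.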
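To make the arithmetic concrete, here is a rough sketch. It assumes a simple model where every iteration has the same cost and the quoted penalties multiply per iteration; the function names and the 1 ns baseline are made up for illustration, not taken from the article.

```python
def total_time(iterations, base_cost_ns, slowdown):
    """Total run time if every iteration costs base_cost_ns * slowdown."""
    return iterations * base_cost_ns * slowdown


def overall_slowdown(iterations, base_cost_ns, slowdown):
    """Ratio of slowed-down run time to baseline run time."""
    slow = total_time(iterations, base_cost_ns, slowdown)
    fast = total_time(iterations, base_cost_ns, 1.0)
    return slow / fast


# Upper end of the quoted factors: 4x (spilling) times 3x (cache pressure).
per_iteration = 4.0 * 3.0

for n in (1, 1_000, 1_000_000_000):
    # The iteration count cancels out of the ratio.
    print(n, overall_slowdown(n, 1.0, per_iteration))
```

Under that model the ratio is 12 whether you run one iteration or a billion, so the 158,000x figure would have to come from something else.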