(no title)
phaker | 2 years ago
I think this should give a more staircase-like plot, with a jump each time the size of an individual copy passes a whole multiple of the cache line width.
This way you don't need to fight prefetching, and there will be one read-modify-write per operation except right at the edges, which is another signal you can look for.
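To make the expected shape concrete, here is a rough sketch of the counting behind it. It is a model, not a benchmark. It assumes a 64-byte line and a copy that starts on a line boundary unless an offset is given, and it treats any partly overwritten line as needing a read-modify-write. The names `LINE`, `lines_touched` and `partial_lines` are made up for illustration.

```python
LINE = 64  # assumed cache line width in bytes; the real value is what the plot would reveal


def lines_touched(size, offset=0, line=LINE):
    """Count the distinct cache lines covered by bytes [offset, offset + size)."""
    if size <= 0:
        return 0
    return (offset + size - 1) // line - offset // line + 1


def partial_lines(size, offset=0, line=LINE):
    """Count the lines that are only partly overwritten (modelled as read-modify-write)."""
    if size <= 0:
        return 0
    end = offset + size
    first, last = offset // line, (end - 1) // line
    head = offset % line != 0  # copy starts in the middle of a line
    tail = end % line != 0     # copy ends in the middle of a line
    if first == last:
        # Everything fits in one line: it is partial unless exactly filled.
        return 1 if (head or tail) else 0
    return int(head) + int(tail)


if __name__ == "__main__":
    # Sizes around the first two line boundaries show the steps.
    for size in (1, 63, 64, 65, 127, 128, 129):
        print(size, lines_touched(size), partial_lines(size))
```

In this model the line count steps up by one as soon as the size passes a multiple of `LINE`, and the partial-line count is 1 everywhere except at exact multiples, where it drops to 0. Real hardware may blur both signals.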
unknown | 2 years ago
[deleted]