I may be completely out of line here, but isn't the story on ARM very different? I vaguely recall that the whole point of things like weak atomics is that on x86 they don't do anything, while on ARM they are essential for cache coherency and memory ordering. Then again, I may just be conflating memory ordering and coherency.
jeffbee|4 months ago
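The instructions you are referring to are not atomics; they are instructions that constrain the ordering of loads and stores. x86 has total store ordering (TSO) by design, so ordinary loads and stores already behave as acquire loads and release stores. On ARM, the program has to use instructions such as LDAR/STLR (or explicit barriers such as DMB) to establish ordering.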
phire|4 months ago
Memory ordering has nothing to do with cache coherency; it is all about what happens within the CPU pipeline itself. On ARM, reads and writes can be reordered inside the pipeline before they hit the caches (which are still fully coherent).
ARM still has strict memory ordering for code within a single core (some older processors do not), but the writes from one core might become visible to other cores in a different order from the one in which the program issued them.
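To make the cross-core case concrete, here is a minimal C++ sketch of the classic message-passing pattern. The names `payload`, `ready` and `run_message_passing` are made up for illustration. The assumption is a typical compiler: on AArch64 the release store and acquire load usually become STLR and LDAR (or LDAPR on newer cores), while on x86-64 both are plain MOVs because TSO already gives that ordering.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes the data

int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // (1) write the data
        // (2) release store: (1) may not be reordered after this store
        ready.store(true, std::memory_order_release);
    });

    std::thread consumer([&seen] {
        // acquire load: the read of payload below may not move above it
        while (!ready.load(std::memory_order_acquire)) {
            // spin until the producer publishes
        }
        seen = payload;   // guaranteed to observe 42
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

If both operations on `ready` were `memory_order_relaxed`, the C++ memory model would make the read of `payload` a data race, and on ARM hardware the consumer could in practice see `ready == true` while still reading the old `payload`. On x86 the hardware would not reorder the two stores or the two loads, but the compiler still could, so the annotations are not optional there either.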
gpderetta|3 months ago