top | item 45354325

orlp | 5 months ago

AArch64 does indeed have a proper atomic max, but even on x86-64 you can get a wait-free atomic max as long as you only need to support 64 distinct values (say, integers from 0 to 63). In that case you can simply do a `lock or` with `1 << i` to record the value `i`; the current maximum is then the index of the highest set bit. You can even support larger ranges by using multiple 64-bit words, e.g. four 64-bit words for a u8 maximum.
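A minimal Rust sketch of the bitmask idea, for illustration only. The type names `BitMax` and `BitMaxU8` are made up here, and relaxed ordering is an assumption that may not suit every use. `fetch_or` with a discarded result is typically lowered to a single `lock or` on x86-64, though that depends on the compiler.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical wait-free maximum over values 0..=63, stored as a bitmask.
pub struct BitMax(AtomicU64);

impl BitMax {
    pub const fn new() -> Self {
        BitMax(AtomicU64::new(0))
    }

    /// Record `v` by setting bit `v`. Panics if `v >= 64`.
    pub fn record(&self, v: u32) {
        assert!(v < 64);
        self.0.fetch_or(1u64 << v, Ordering::Relaxed);
    }

    /// Highest value recorded so far, or `None` if nothing was recorded.
    pub fn max(&self) -> Option<u32> {
        let bits = self.0.load(Ordering::Relaxed);
        if bits == 0 { None } else { Some(63 - bits.leading_zeros()) }
    }
}

/// Hypothetical extension to the full u8 range using four 64-bit words.
pub struct BitMaxU8([AtomicU64; 4]);

impl BitMaxU8 {
    pub fn new() -> Self {
        BitMaxU8([
            AtomicU64::new(0),
            AtomicU64::new(0),
            AtomicU64::new(0),
            AtomicU64::new(0),
        ])
    }

    /// Record `v`: the word index is `v / 64`, the bit index is `v % 64`.
    pub fn record(&self, v: u8) {
        let (word, bit) = ((v / 64) as usize, v % 64);
        self.0[word].fetch_or(1u64 << bit, Ordering::Relaxed);
    }

    /// Scan from the highest word down and return the top set bit.
    pub fn max(&self) -> Option<u8> {
        for word in (0..4).rev() {
            let bits = self.0[word].load(Ordering::Relaxed);
            if bits != 0 {
                return Some((word as u32 * 64 + 63 - bits.leading_zeros()) as u8);
            }
        }
        None
    }
}
```

The multi-word read is not an atomic snapshot of all four words, but since bits are only ever set, the result is always a value that some thread actually recorded.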

In most cases it's even better to just store a maximum per thread separately and loop over all threads once to compute the current maximum if you really need it.
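A rough sketch of the per-thread variant, again with made-up names (`PerThreadMax`, `Slot`) and assuming each thread is handed a distinct slot index up front. Because every slot has a single writer, a plain relaxed load and store is enough and no read-modify-write instruction is needed; the 64-byte alignment is an assumption about the cache line size, meant to avoid false sharing.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// One slot per thread, padded to an assumed 64-byte cache line.
#[repr(align(64))]
pub struct Slot(AtomicU64);

/// Hypothetical per-thread maximum: each thread writes only its own slot.
pub struct PerThreadMax {
    slots: Vec<Slot>,
}

impl PerThreadMax {
    pub fn new(threads: usize) -> Self {
        PerThreadMax {
            slots: (0..threads).map(|_| Slot(AtomicU64::new(0))).collect(),
        }
    }

    /// Must only be called by the thread that owns slot `id`.
    /// A single writer per slot makes the load-then-store safe.
    pub fn record(&self, id: usize, v: u64) {
        let slot = &self.slots[id].0;
        if v > slot.load(Ordering::Relaxed) {
            slot.store(v, Ordering::Relaxed);
        }
    }

    /// Loop over all slots once to compute the current overall maximum.
    pub fn max(&self) -> u64 {
        self.slots
            .iter()
            .map(|s| s.0.load(Ordering::Relaxed))
            .max()
            .unwrap_or(0)
    }
}
```

The trade-off is that reads cost one pass over all slots, which is usually acceptable when the maximum is queried far less often than it is updated.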

jerrinot|5 months ago

That’s a neat trick, albeit with limited applicability given the very narrow range. Thanks for sharing!