thequux | 5 months ago

I think that this can change the semantics, though: with the preceding check, you can miss the shared variable being decremented from another thread. In some cases, such as when the shared value is monotonic, this is fine, but not in the general case.
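For readers following along, here is a minimal Rust sketch of the two shapes being compared. The function names `max_plain` and `max_with_check` are hypothetical, and a relaxed atomic load stands in for the "preceding check" (a truly non-atomic read of shared data would be undefined behavior in Rust). It is only meant to show where the window between the check and the read-modify-write sits, not how any particular compiler or library implements it.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Plain atomic max: always issues the read-modify-write.
/// Returns the value that was stored before the operation.
pub fn max_plain(shared: &AtomicU64, v: u64) -> u64 {
    shared.fetch_max(v, Ordering::Relaxed)
}

/// Hypothetical fast-path variant: load first, and skip the
/// read-modify-write when the observed value is already >= v.
pub fn max_with_check(shared: &AtomicU64, v: u64) -> u64 {
    let seen = shared.load(Ordering::Relaxed);
    if seen >= v {
        // No store is issued on this path. If another thread lowers
        // the value right after this load, this call never notices.
        return seen;
    }
    shared.fetch_max(v, Ordering::Relaxed)
}
```

With no other thread touching the value, the two functions return the same result and leave the same state behind; the disagreement in this thread is only about what happens when other writers, such as a decrement, are involved.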

anematode | 5 months ago

With a relaxed ordering, I'm not sure that's right: the ldumax would have no imposed ordering relation with the (atomic) decrement on another thread, and so could very well have operated on the old value obtained by the non-atomic load.

gpderetta | 5 months ago

All operations on a single memory location are always totally ordered in a cache-coherent (CC) system, no matter how relaxed the memory model is.

Also, am I understanding correctly that n is the number of threads in your example? Don't you find it suspicious that the number of operations goes up as the thread count goes up?

edit: OK, you are saying that under heavy contention the check avoids having to do the store at all. This is racy, and whether it is correct would be very application-specific.

edit2: I thought about this a bit, and I'm not sure I can come up with a scenario where the race matters...

edit3: ... as long as all threads are only doing atomic_max operations on the memory location, which an implementation can't assume.
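As a rough illustration of the max-only case described in these edits, the sketch below runs several threads that perform nothing but a checked atomic max on one location. The names `fetch_max_checked` and `run_max_only` are made up for this example, and a relaxed atomic load again stands in for the check. Because the value can only grow in this setup, the final result is the largest input whichever path each thread takes; the sketch says nothing about workloads that also decrement or store.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical checked max: skip the read-modify-write when the
/// observed value is already large enough.
pub fn fetch_max_checked(shared: &AtomicU64, v: u64) {
    if shared.load(Ordering::Relaxed) >= v {
        return; // fast path, no store issued
    }
    shared.fetch_max(v, Ordering::Relaxed);
}

/// Spawns one thread per input value; every thread performs only the
/// checked max. Returns the value left in the shared location.
pub fn run_max_only(values: &[u64]) -> u64 {
    let shared = AtomicU64::new(0);
    thread::scope(|s| {
        for &v in values {
            let shared = &shared;
            s.spawn(move || fetch_max_checked(shared, v));
        }
    }); // all spawned threads are joined when the scope ends
    shared.load(Ordering::Relaxed)
}
```

Adding a thread that calls `fetch_sub` or `store` on the same location breaks the monotonicity this relies on, which is the situation edit3 says an implementation cannot rule out.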

ibraheemdev | 5 months ago

It does make a difference, of course: if you're running fetch_max from multiple threads, adding a load fast-path introduces a race condition.