(no title)
skitter | 9 months ago
As a result, uncontended locks work the same as described in the blog post above; under contention, performance is similar to a futex too. But now your locks are only one byte in size, regardless of platform – while Windows allows 1-byte futexes, they're always 4 bytes on Linux and iirc Darwin doesn't quite have an equivalent api (but I might be wrong there). You also have more control over parked threads if you want to implement different fairness criteria, reliable timeouts or parking callbacks.
One drawback of this is that you can only easily use this within one process, while at least on Linux futexes can be shared between processes.
I've written a blog post[2] about using futexes to implement monitors (reëntrant mutexes with an associated condvar) in a compact way for my toy Java Virtual Machine, though I've since switched to a parking-lot-like approach.
[0]: https://github.com/amanieu/parking_lot [1]: https://webkit.org/blog/6161/locking-in-webkit [2]: https://specificprotagonist.net/jvm-futex.html
jcranmer|9 months ago
That's not a very useful property, though. Because inter-core memory works on cache-line granularities, packing more than one lock in a cache line is a Bad Idea™. Potentially it allows you to pack more data being protected by a lock with that data... but alignment rules means that you're going to invariably end up spending 4 or 8 bytes (via a regular integer or a pointer) on that lock anyways.
vlovich123|9 months ago
zozbot234|9 months ago
gpderetta|9 months ago
mandarax8|9 months ago
With 4 byte locks your run into the exact same false sharing issues.
gmokki|9 months ago
Double checks. Nope. The api is there and the patch to implement them has been posted multiple times: https://lore.kernel.org/lkml/20241025093944.707639534@infrad...
But the small futex2 patch will not go forward until some users say they want/need the feature