charleslmunger | 2 months ago
If your sections are that short then you can use a hybrid mutex and never actually park. Unless you're wrong about how long things take, in which case you'll save yourself.
> alignas(64) in C++
std::hardware_destructive_interference_size exists so you don't have to guess, although in practice it'll basically always be 64.
The code samples also don't obey the basic best practices for spinlocks for x86_64 or arm64. Spinlocks should perform a relaxed read in the loop, and only attempt a compare and set with acquire order if the first check shows the lock is unowned. This avoids hammering the CPU with cache coherency traffic.
Similarly, the x86 PAUSE instruction isn't mentioned, even though it exists specifically to signal spin sections to the CPU.
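To make the two points above concrete, here is a minimal sketch of a test-and-test-and-set spinlock: it spins on a relaxed load, attempts the acquire compare-exchange only once the lock looks free, and issues a pause hint while waiting. The names `cpu_relax`, `TtasSpinlock` and `count_with_spinlock` are made up for illustration, and using `yield` as the arm64 hint is an assumption rather than something stated in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
#include <immintrin.h>  // _mm_pause
#endif

// Hypothetical helper: tell the CPU we are inside a spin-wait loop.
inline void cpu_relax() {
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
    _mm_pause();                         // emits the x86 PAUSE instruction
#elif defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    asm volatile("yield" ::: "memory");  // assumed arm64 equivalent hint
#else
    std::this_thread::yield();           // portable fallback, not a true pause
#endif
}

class TtasSpinlock {
public:
    void lock() {
        for (;;) {
            // Wait with plain relaxed reads; the cache line can stay shared
            // between waiters instead of bouncing in exclusive state.
            while (locked_.load(std::memory_order_relaxed)) {
                cpu_relax();
            }
            // The lock looked free, so only now attempt the acquire.
            bool expected = false;
            if (locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
                return;
            }
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical demo: several threads increment one counter under the lock.
long count_with_spinlock(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    TtasSpinlock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

This only illustrates the access pattern; it does nothing about a lock holder being preempted, which is the objection raised in the next paragraph.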
Spinlocks outside the kernel are a bad idea in almost all cases, except dedicated nonpreemptable cases; use a hybrid mutex. Spinning for consumer threads can be done in specialty exclusive thread per core cases where you want to minimize wakeup costs, but that's not the same as a spinlock which would cause any contending thread to spin.
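As a rough sketch of what "hybrid" means here: try to take the lock a bounded number of times without blocking, then fall back to a normal blocking acquire so a preempted holder cannot leave waiters burning CPU indefinitely. The class name `HybridMutex`, the demo function and the spin limit of 100 are all assumptions for illustration; production implementations tune the limit and usually add a pause hint between attempts.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

class HybridMutex {
public:
    void lock() {
        // Optimistic phase: short critical sections usually finish within
        // a few attempts, so the thread never has to park.
        for (int attempt = 0; attempt < kSpinLimit; ++attempt) {
            if (inner_.try_lock()) return;
        }
        // Fallback phase: block in the OS until the lock is released.
        inner_.lock();
    }

    bool try_lock() { return inner_.try_lock(); }
    void unlock() { inner_.unlock(); }

private:
    static constexpr int kSpinLimit = 100;  // assumed value, tune per workload
    std::mutex inner_;
};

// Hypothetical demo: several threads increment one counter under the lock.
long count_with_hybrid_mutex(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    HybridMutex mu;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                std::lock_guard<HybridMutex> guard(mu);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Because the class provides `lock`, `try_lock` and `unlock`, it works with `std::lock_guard` and `std::unique_lock` like any other mutex type.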
raggi|2 months ago
Very much this. Spins benchmark well but scale poorly.
magicalhippo|2 months ago
Yeah, pure spinlocks in user-space programs are a big no-no in my book. With a hybrid mutex, if you're on the happy path then it costs you nothing extra in terms of performance, and if you for some reason slide off the happy path you have a sensible fall-back.
charleshn|2 months ago
Unfortunately it's not quite true, due to e.g. spatial prefetching [0]. See e.g. Folly's definition [1].
[0] https://community.intel.com/t5/Intel-Moderncode-for-Parallel...
[1] https://github.com/facebook/folly/blob/d2e6fe65dfd6b30a9d504...
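A small sketch of how the padding choice could look in code, assuming a C++17 compiler. It takes `std::hardware_destructive_interference_size` when the library provides it and then rounds up to 128 bytes, on the assumption that adjacent-line prefetching can make two neighbouring 64-byte lines interfere. The 128 figure, the 64-byte fallback and all the names (`kStdLine`, `kPad`, `PaddedCounter`, `Counters`, `byte_gap`) are illustrative assumptions, not values taken from the links above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kStdLine = std::hardware_destructive_interference_size;
#else
constexpr std::size_t kStdLine = 64;  // assumed fallback when the library lacks it
#endif

// Assumption: pad to at least 128 bytes so two adjacent 64-byte lines
// are never shared between independently written objects.
constexpr std::size_t kPad = kStdLine < 128 ? 128 : kStdLine;

struct alignas(kPad) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kPad, "padding fills the whole slot");
static_assert(alignof(PaddedCounter) == kPad, "slot starts on a pad boundary");

// Two counters that different threads can update without false sharing.
struct Counters {
    PaddedCounter a;
    PaddedCounter b;
};

// Hypothetical helper: distance in bytes between the two counters.
std::size_t byte_gap(const Counters& c) {
    auto lo = reinterpret_cast<std::uintptr_t>(&c.a);
    auto hi = reinterpret_cast<std::uintptr_t>(&c.b);
    assert(hi >= lo);  // members are laid out in declaration order
    return static_cast<std::size_t>(hi - lo);
}
```

The cost is memory: every padded object occupies a full `kPad` bytes, so this is worth doing only for data that is actually written concurrently from different cores.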
menaerus|2 months ago
surajrmal|2 months ago
menaerus|2 months ago
glibc pthread mutex uses a user-space spinlock to mitigate the syscall cost for uncontended cases.
charleslmunger|2 months ago
imtringued|2 months ago
nly|2 months ago
saagarjha|2 months ago
Of course, this is just the number the compiler thinks is good. It’s not necessarily the number that is actually good for your target machine.
nly|2 months ago
Most people using spinlocks really care about latency, and many will have hyperthreading disabled to reduce jitter.
SkiFire13|2 months ago