These are synthetic benchmarks, but the difference is quite significant in them.
From a different tweet:
> It's the total time for 32 threads each doing 10'000 lock+unlocks (on a 64C/128T threadripper). So, the numbers you quoted correspond to a lock+unlock operation going from 8.75ns to 2.45ns, under low contention.
> The numbers can vary a lot in different situations/hardware though.
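For anyone who wants to poke at this themselves, here is a rough sketch of that kind of measurement using only `std`. To be clear, this is not the benchmark from the tweet: the `run_bench` helper and the shared-counter workload are my own assumptions, and the wall-clock time includes thread spawn and join overhead, so treat the ns figure as a loose upper bound at best.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

// Figures taken from the quoted tweet: 32 threads, 10'000 lock+unlocks each.
const THREADS: usize = 32;
const ITERS: usize = 10_000;

/// Hypothetical helper: runs `threads` threads that each lock and unlock a
/// shared mutex `iters` times. Returns the final counter and the wall time.
fn run_bench(threads: usize, iters: usize) -> (u64, Duration) {
    let counter = Arc::new(Mutex::new(0u64));
    let start = Instant::now();

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    // The guard is a temporary, so the mutex is unlocked
                    // again at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let elapsed = start.elapsed();
    let total = *counter.lock().unwrap();
    (total, elapsed)
}

fn main() {
    let (total, elapsed) = run_bench(THREADS, ITERS);
    // Wall-clock time divided by total operations; includes spawn/join cost.
    let ns_per_op = elapsed.as_nanos() as f64 / total as f64;
    println!("{total} lock+unlock pairs in {elapsed:?} (~{ns_per_op:.2} ns each)");
}
```

Build with `--release`, and expect the result to move around a lot with core count, contention, and OS, as the tweet says.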
I think the focus is on the performance differences that come from synchronization and implementation choices (https://twitter.com/m_ou_se/status/1526211117651050497), which are not super easy to characterize but come from much more than just removing an allocation.
> you're often going to be better off eliminating the Arc/Mutex anyway
Not always. Mutexes can be really fast (10-20ns), especially since they often optimistically spin, and Arc in Rust is (often) relatively low cost since you can hand out "free" refs (plain borrows) without touching the atomic refcount.
If removing the Arc/Mutex would require allocations, the Arc/Mutex could easily be faster.
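A small sketch of what I mean by "free" refs, assuming a made-up `read_len` helper: borrowing through the `Arc` is just a pointer dereference, and only `Arc::clone` (and the matching drop) touches the atomic count.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical helper: takes a plain borrow, so calling it does not
// increment or decrement the Arc's reference count.
fn read_len(data: &Mutex<Vec<u32>>) -> usize {
    data.lock().unwrap().len()
}

fn main() {
    let shared = Arc::new(Mutex::new(vec![1, 2, 3]));
    let before = Arc::strong_count(&shared);

    // `&shared` is a `&Arc<Mutex<..>>`; deref coercion turns it into
    // `&Mutex<..>` with no atomic operation on the count.
    let len = read_len(&shared);
    assert_eq!(len, 3);
    assert_eq!(Arc::strong_count(&shared), before);

    // Only creating a new owner performs the atomic increment.
    let second_owner = Arc::clone(&shared);
    assert_eq!(Arc::strong_count(&shared), before + 1);
    drop(second_owner);
}
```

So the atomic cost is paid once per new owner (typically once per thread), not once per access.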
staticassertion|3 years ago
kzrdude|3 years ago
loeg|3 years ago
staticassertion|3 years ago
jstimpfle|3 years ago