top | item 30277050

(no title)

dpratt71 | 4 years ago

Stupid question: Why isn't such logic incorporated into the call itself?


ww520|4 years ago

Good question. It's for performance reasons. Calling into the kernel is expensive. A lock acquired in user mode is much faster than a lock acquired in the kernel.

A typical lock acquisition using futex looks like:

  (a) while !compare_and_swap(&lock, 0, 1) // see if flag is free as 0, lock it as 1
  (b)     futex(&lock, WAIT, 1)            // if flag is still 1, sleep until woken

(a) runs in user mode and it's very fast. On x86 it's just one LOCK CMPXCHG assembly instruction. If the lock is free, it's acquired in one instruction by setting the flag to 1 (locked).

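If the lock is not free, the thread makes the expensive call into the kernel at (b) to sleep via futex. futex() checks that the flag still holds the expected value and, if so, puts the thread to sleep so it doesn't hog the CPU; if the value has already changed, it returns immediately and the loop retries. The thread releasing the lock sets the flag back to 0 and calls futex with WAKE to wake a sleeper.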
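For reference, a minimal C sketch of the (a)/(b) loop above, assuming Linux and C11 atomics. The names simple_lock, futex_call, simple_lock_acquire and simple_lock_release are made up for illustration, and glibc ships no futex() wrapper, so the call goes through syscall(2). This is not how any real mutex is implemented, just the same shape with the wake-up side filled in:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock type: 0 = free, 1 = locked, as in the pseudocode. */
typedef struct { _Atomic uint32_t state; } simple_lock;

/* Hypothetical helper: invoke the futex syscall directly via syscall(2).
   The _PRIVATE ops used below assume the lock is not shared across processes. */
long futex_call(_Atomic uint32_t *addr, int op, uint32_t val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

void simple_lock_acquire(simple_lock *l) {
    uint32_t expected = 0;
    /* (a) user-mode fast path: one atomic compare-and-swap, no syscall. */
    while (!atomic_compare_exchange_strong(&l->state, &expected, 1)) {
        /* (b) slow path: sleep in the kernel only if the word is still 1.
           FUTEX_WAIT returns right away if the value has already changed. */
        futex_call(&l->state, FUTEX_WAIT_PRIVATE, 1);
        expected = 0; /* a failed CAS stored the observed value here */
    }
}

void simple_lock_release(simple_lock *l) {
    uint32_t prev = atomic_exchange(&l->state, 0);
    assert(prev == 1); /* releasing a lock that is not held is a bug */
    (void)prev;
    /* Wake at most one sleeper. This sketch pays for a syscall on every
       release; real locks usually track contention to skip it when no
       thread is waiting. */
    futex_call(&l->state, FUTEX_WAKE_PRIVATE, 1);
}
```

In the uncontended case simple_lock_acquire never enters the kernel, which is the whole point of keeping step (a) outside the call.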

RustyRussell|4 years ago

Importantly, prior to this (and, hell, even since) the state of the art was to try atomically, then yield (or sleep(0)), then try usleep.

The kernel had no idea what was going on, so it had no idea how to schedule such a thing. In particular, it didn't know to wake you when the lock (which it knew nothing about) became available.
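A rough C sketch of that pre-futex pattern, for comparison. The names backoff_lock, backoff_lock_acquire and backoff_lock_release and the 100 microsecond sleep are assumptions chosen for illustration, not taken from any particular library:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical lock type: 0 = free, 1 = locked. */
typedef struct { atomic_int state; } backoff_lock;

/* Returns how many attempts failed before the lock was acquired. */
int backoff_lock_acquire(backoff_lock *l) {
    int failures = 0;
    for (;;) {
        int expected = 0;
        /* Step 1: try atomically. */
        if (atomic_compare_exchange_strong(&l->state, &expected, 1))
            return failures;
        failures++;
        if (failures == 1)
            sched_yield(); /* Step 2: give up the CPU once and retry. */
        else
            usleep(100);   /* Step 3: sleep blindly; 100 us is arbitrary. */
    }
}

void backoff_lock_release(backoff_lock *l) {
    int prev = atomic_exchange(&l->state, 0);
    assert(prev == 1); /* releasing a lock that is not held is a bug */
    (void)prev;
    /* Nothing tells the kernel the lock is free: a sleeping waiter only
       finds out when its usleep() timer expires. */
}
```

The release path has no way to notify anyone, so a waiter either burns CPU retrying or oversleeps after the lock is already free.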

chrchang523|4 years ago

It is worth noting here that this "one assembly instruction" is not that cheap. The hardware on a multicore system does have to perform some locking under the hood to execute that instruction. But yes, it still has enough of an advantage over calling into the kernel to justify the additional usage complexity.

derefr|4 years ago

The gettimeofday(2) vDSO is pure-userspace code. Why not, then, a futex vDSO function that does a while + compare_and_swap in userspace, but then contains a real call to the futex(2) syscall?

oofabz|4 years ago

Incorporating thread-safety into individual calls is not sufficient, because thread-safety is not composable. A sequence of calls to thread-safe operations is not itself thread-safe. Since you need locks in the outermost application level, it would be wasteful and unnecessary to also lock inside each individual call.

This non-composability is why multithreading is so difficult. A library can't abstract away these concerns for you, unless the entire multithreaded operation is one call to the library, which requires the library to know and design for your exact use case. If you want to do something a library didn't plan for, you are required to handle the thread synchronization yourself.
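A small C illustration of the point, using pthreads. The account type and the function names are hypothetical; the sketch assumes every caller that needs the check-then-update behaviour goes through the same outer lock:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical type whose individual operations are each thread-safe. */
typedef struct { pthread_mutex_t mu; long balance; } account;

long account_get(account *a) {
    pthread_mutex_lock(&a->mu);
    long v = a->balance;
    pthread_mutex_unlock(&a->mu);
    return v;
}

void account_add(account *a, long delta) {
    pthread_mutex_lock(&a->mu);
    a->balance += delta;
    pthread_mutex_unlock(&a->mu);
}

/* Two thread-safe calls in sequence are NOT thread-safe as a whole:
   another thread can withdraw between the check and the update,
   driving the balance negative. */
int withdraw_racy(account *a, long amount) {
    if (account_get(a) < amount)
        return 0;
    account_add(a, -amount);
    return 1;
}

/* The application-level lock makes check + update one critical section.
   With it in place, the inner per-call locks are redundant work. */
pthread_mutex_t outer_lock = PTHREAD_MUTEX_INITIALIZER;

int withdraw_safe(account *a, long amount) {
    pthread_mutex_lock(&outer_lock);
    int ok = account_get(a) >= amount;
    if (ok)
        account_add(a, -amount);
    pthread_mutex_unlock(&outer_lock);
    return ok;
}
```

withdraw_safe is only correct if every other writer also takes outer_lock, which is exactly the kind of whole-program knowledge a library cannot have on your behalf.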

58028641|4 years ago

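Haskell’s STM (software transactional memory) offers composable concurrency: smaller transactions can be combined into one larger atomic transaction.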