jgtrosh|7 years ago
This piece seems to have anticipated what has since become a very active area of everyday software development.
What are the alternative paradigms that have actually come into common use? Coroutines and async/await are the ones I hear about online, but what are the others? I've seen people tout zmq-communicating-processes with standard patterns as the solution to all problems, and I'm happy not to have to maintain the results.
Have we effectively “solved” the concurrency problem, and if so, what's left as an exercise for the future?
chrisseaton|7 years ago
Although this paper talks initially about concurrency, you can see that he's really talking about concurrency specifically for the purpose of parallelism.
Coroutines don't solve the parallelism part, because they're concurrent but mutually exclusive: only one runs at a time.
Async/await as implemented in JavaScript doesn't solve the parallelism part either, for the same reason, and async/await as implemented in C# has exactly the same problems as threads.
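The "concurrent but not parallel" point is easy to see in miniature. Here is a small sketch (using Python's asyncio as a stand-in for coroutines generally, so the API details are this example's choice, not the paper's): two coroutines take turns on one thread, so their steps interleave but never overlap, and CPU-bound work gets no speedup.

```python
import asyncio

order = []

async def worker(name, steps):
    for i in range(steps):
        order.append((name, i))   # record who ran this step
        await asyncio.sleep(0)    # yield control back to the event loop

async def main():
    # Both coroutines are "running", but only ever one at a time.
    await asyncio.gather(worker("a", 3), worker("b", 3))

asyncio.run(main())
print(order)  # steps from "a" and "b" interleave, strictly one at a time
```

Because control only transfers at the explicit `await`, the interleaving here is cooperative and deterministic, which is exactly why coroutines are easier to reason about than threads and also why they are not parallelism.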
There are many ideas for how to solve the problem, but I think anyone who is honest will tell you that none is a perfect solution for all situations where you want parallelism or concurrency.
For example, to use zmq-communicating-processes effectively you need a problem where you can divide the data or tasks cleanly a priori. We simply don't have the mathematical understanding of how to do that for some important algorithms that people really need to run in parallel today, such as triangulation or mesh refinement.
We probably need some radical new idea, or maybe it's increasingly looking like only a mix of ideas will work.
toast0|7 years ago
At risk of being that guy: the actor model (à la Erlang) is pretty good at concurrency. If you're unfamiliar, the idea is basically no shared state, with communication between actors (Erlang processes) by sending asynchronous messages to the other actor's message queue.
The code for each actor is usually pretty small and easy to reason about. However, the emergent behavior of the system, and the ordering between messages from multiple actors, can become tricky. Also, long-term exposure to this idea will warp your mind :)
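A rough sketch of the actor shape in Python (hedged: real Erlang adds process isolation, links, and supervision that this toy lacks; the `Counter` actor and its message names are made up for illustration). Each actor owns its state and a mailbox, and the only way to interact with it is to send a message.

```python
import threading, queue

class Counter:
    """A toy actor: private state, driven entirely by its mailbox."""
    def __init__(self):
        self.mailbox = queue.Queue()
        self.count = 0  # private: no other thread ever touches this
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            msg, reply = self.mailbox.get()  # messages processed one at a time
            if msg == "incr":
                self.count += 1
            elif msg == "get":
                reply.put(self.count)        # reply via the sender's queue
            elif msg == "stop":
                return

    def send(self, msg, reply=None):
        self.mailbox.put((msg, reply))

actor = Counter()
for _ in range(3):
    actor.send("incr")
reply = queue.Queue()
actor.send("get", reply)
value = reply.get()
print(value)  # 3
actor.send("stop")
```

Because the actor processes its mailbox sequentially, the state needs no locks; as the comment above says, the hard part moves to reasoning about message ordering between many such actors.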
pcwalton|7 years ago
Data parallelism in CUDA, OpenGL, and other GPU APIs is doing fantastically, and has for decades. (If writes are allowed, these APIs technically have the same problems as threads, but in practice they're easier to deal with, since traditional mutex locks and condition variables are mostly unavailable in that environment and the APIs force you to declare the sharing semantics of your data buffers carefully.)
Most parallel (not concurrent) problems map well to the data-parallel model. Even Make is basically a data-parallel API with read-only constant data, just with a more complex dependency graph.
Promises in JavaScript have become quite popular. Unfortunately, they aren't well understood, so rather than being used in areas where they can improve the performance of JavaScript applications, they're mostly being used just to reduce nested callbacks.
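The performance point can be sketched with Python's asyncio standing in for JS promises (an analogy of my choosing, not the comment's; `asyncio.gather` plays the role of `Promise.all`, and `asyncio.sleep` stands in for a network call): awaiting two I/O waits one after the other takes roughly the sum of their times, while starting both and then awaiting takes roughly the max.

```python
import asyncio, time

async def fetch(delay):
    await asyncio.sleep(delay)  # stands in for a network request
    return delay

async def sequential():
    a = await fetch(0.1)        # second wait doesn't start until the first ends
    b = await fetch(0.1)
    return a + b

async def concurrent():
    a, b = await asyncio.gather(fetch(0.1), fetch(0.1))  # both in flight at once
    return a + b

t0 = time.perf_counter()
asyncio.run(sequential())
seq = time.perf_counter() - t0

t0 = time.perf_counter()
asyncio.run(concurrent())
con = time.perf_counter() - t0

print(f"sequential {seq:.2f}s, concurrent {con:.2f}s")
```

Flattening nested callbacks gets you the `sequential` version; the performance win the comment is pointing at is the `concurrent` one.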
rectang|7 years ago
I've always liked section 3 of this paper, specifically the idea that "infinite interleavings" make threads executing in parallel non-deterministic and difficult to reason about. That gets to the heart of why threaded programs are so prone to heisenbugs.
"They make programs absurdly nondeterministic, and rely on programming style to constrain that nondeterminism to achieve deterministic aims."
You can't write an infinite number of test cases to cover all those interleavings, and it takes hard thought to suss out where any problems might lie.
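One of those "infinite interleavings" can be forced deterministically to show why it matters. In this sketch, five threads each do an unprotected read-modify-write of a shared counter; a barrier (my addition, purely to make the bad schedule reproducible) makes every thread read before any thread writes, so four of the five increments are lost.

```python
import threading

counter = 0
barrier = threading.Barrier(5)

def unsafe_increment():
    global counter
    tmp = counter      # read the shared value
    barrier.wait()     # force the worst schedule: all reads before any write
    counter = tmp + 1  # write back a now-stale value

threads = [threading.Thread(target=unsafe_increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 1, not 5: this schedule is one of the "infinite interleavings"
```

In real code nothing forces this schedule; it just happens rarely and unreproducibly, which is exactly the heisenbug problem.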
chubot|7 years ago
This talk about FoundationDB was brought up recently, and it's pretty amazing. I recommend watching the whole thing, but in brief: they are taming the "infinite interleavings" problem through determinism.
"Testing Distributed Systems w/ Deterministic Simulation" by Will Wilson
https://www.youtube.com/watch?v=4fFDFbi3toc
They wrote an interesting actor DSL that compiles to C++ and is completely deterministic, and they torture this deterministic engine with generated test cases on a cluster every night.
I guess you could say that the cluster as a whole is necessarily non-deterministic, but an individual node is deterministic, given an ordering of the messages it receives.
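The core trick can be shown in a toy form (this is only a sketch of the concept; FoundationDB's actual tooling is far more elaborate, and the node names and messages below are made up): drive all scheduling decisions from a seeded RNG, and any interleaving, including a failing one, can be replayed exactly from its seed.

```python
import random

def simulate(seed):
    """Deliver pending messages in a random but seed-reproducible order."""
    rng = random.Random(seed)
    pending = [("node_a", "put x=1"),   # made-up messages for illustration
               ("node_b", "put x=2"),
               ("node_a", "get x")]
    log = []
    while pending:
        i = rng.randrange(len(pending))  # the "network" picks arbitrarily,
        log.append(pending.pop(i))       # but reproducibly, from the seed
    return log

# Same seed, same interleaving: a failing schedule can be replayed exactly.
assert simulate(42) == simulate(42)
print(simulate(42))
```

Generated test cases then amount to sweeping seeds: each seed explores one interleaving, and the interesting ones are kept and replayed.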
jlarocco|7 years ago
This is just my opinion, but I've never found that part of multi-threading difficult. Interleaving doesn't matter except where resources are shared between multiple threads, and the solution is to protect the resource with a mutex.
Sometimes it's hard to tell when a resource is shared, but that has more to do with not knowing how the code works than with multi-threading itself.
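The fix described above, sketched in Python: a shared counter whose read-modify-write is wrapped in a mutex, so the increments can no longer interleave destructively and every update survives.

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    with lock:             # only one thread at a time in the critical section
        tmp = counter
        counter = tmp + 1  # read and write happen atomically w.r.t. other threads

threads = [threading.Thread(target=safe_increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 5: no increments lost
```

The hard part, as the comment notes, isn't the `with lock:` line; it's knowing that `counter` is shared in the first place.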
vvanders|7 years ago
My favorite one is where adding debug traces makes the heisenbug disappear, because the printf() inserted a memory fence somewhere deep in the logging library.
Nothing like debugging via atomics.
ModernMech|7 years ago
> To offer a third analogy, a folk definition of insanity is to do the same thing over and over again and to expect the results to be different. By this definition, we in fact require that programmers of multithreaded systems be insane. Were they sane, they could not understand their programs.
I actually had this exact thought when implementing pthreads for a course. I noted to myself, "Gee, I keep doing the same thing and every time I get a different result... I must be insane, according to the definition."