tailhook | 10 years ago
But let's talk about what's wrong with threads:
1. Timeout handling is ugly: you need to account for a timeout in every read and write operation. At best, timeout handling makes coroutine/threaded code no better than state-machine code. In the state-machine approach I can set a deadline once and update it only when it changes (note to myself: should add such an example to the documentation). In Python, this is usually fixed by spawning another coroutine that sleeps and then throws an exception into the one doing I/O. It works well, but Rust will never get exceptions (I hope)
2. When you make a server that receives a request, looks something up in the DB, then responds, there is an incentive for each thread to own a DB connection (created itself or acquired from the pool). This is an unfortunate pattern. It's better when the DB connection is handled by its own coroutine, because then you can pipeline multiple requests over one connection, monitor whether the connection is still alive, reconnect to the DB while no requests are active, or, conversely, shut down idle connections. By pipelining you can keep fewer connections to the DB, making the load on the database a little lower. When I talk about a DB in this paragraph, I of course mean everything for which this application is a client. Sure, you can do all that in threaded code too, but it's much harder to get right. You need two threads per connection (one reads from the network while the other watches the queue and writes), you need to synchronize both sides, the connection cleanup code is complex, there is more than one level of timeouts now, and so on.
3. You need to avoid deadlocks. Rust takes care of data races, but deadlocks are still possible, and they are not always simple or reproducible, so you will have a hard time debugging them. In single-threaded async code, you are the only user. Even with one async thread per processor, you are more likely to own resources than to lock on them: you can duplicate many things per thread, and you can use coarser-grained locks so you never hold two at once. But it's almost impossible to write a lock-free threaded server.
All of the issues above are fixed neither by async/await nor by any M:N or 1:1 threading approach.
jerf | 10 years ago
1. In either approach, somewhere in your event loop you're setting yourself a timeout to fire. Haskell & Erlang do use exceptions for this, but Go does not; it simply makes it a first-class concern of the core event loop. This is only a problem in languages where threading was bolted on after the fact. Which is a lot of languages, and they matter because they have a lot of code. I don't mean to dismiss those real problems. But it's not a fundamental problem, only an accidental one.
2. In practice, this is not a problem I ever worry about. You get a DB library, it provides pools, unless you're talking to a very, very fast DB (like, memcached on localhost fast) this is one of those cases where IO really does dominate any minor price of thread scheduling.
3. This has been solved for a long time. Go has the nicest little catch phrase, "share memory by communicating instead of communicating by sharing memory", but Haskell, Erlang, and Go each have their own quite distinct solutions to these problems, and in practice all of them work. There are other solutions I simply haven't used, but I hear Clojure's works too. (Perhaps arguably a subset of the several approaches Haskell can use. Haskell supports darned near everything, and you can use it all at once.)
This is part of why I write this sort of thing... at its usual glacial pace (despite how much we like to flatter ourselves that we move quickly), the programming community is finally getting around to being really, seriously pissed off about how bad threading was in the 1990s. Good. We should be. It sucked. Let us never forget that. But what has not been so well noticed is that the problems with threading have basically been fixed, and in production for a long time now (i.e., not just in theory, but in shipping systems; go ask Erlang how long it's been around). You just have to go use the solutions. Don't mistake debates about the minutiae of 1:1 OS threading vs. M:N threading, and which is single-digit percentage points faster than the other, for evidence that threading doesn't work.
Lest I sound too pollyannaish about what is still a hard domain, the way I like to put this is that threading has moved from an exponentially complex problem to a polynomially complex problem. (And Rust is leading the way on making even the polynomial have a small number in the exponent.) There's still a certain amount of complexity in making a threaded program go zoom, and it does require some adjustments to how you program, it's not "free", but rather than requiring wizards, it merely requires competent programmers who take a bit of care and use good tools and best practices now.
dboreham | 10 years ago
devit | 10 years ago
Thread 1 sends message A to thread 2 and waits for a response.
As part of processing message A, thread 2 sends message B to thread 1 and waits for a response... forever, since thread 1 is blocked waiting for thread 2...