peripetylabs | 12 years ago

I'm glad I came across this article. I'm learning Python and was given the advice to use multiprocessing rather than threading, but hadn't researched why. Very informative, thanks for sharing.

pjscott|12 years ago

Well hold on now; there are a lot of times when using threading is easier and faster than using multiprocessing. It depends on what you're doing.

Threading creates new OS-level threads, but whenever your code is being run by the bytecode interpreter, Python holds the global interpreter lock (GIL), so only one thread executes Python bytecode at a time. The GIL is released during blocking I/O operations and inside a lot of the built-in functions that are implemented in C, and you can release it in any C or Cython extension code you write. If you're running into Python speed bottlenecks, you can usually get significant speedups with very little effort by moving the bottlenecky code to Cython and maybe adding a few type declarations.
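To make the I/O case concrete, here is a minimal sketch, not a benchmark. It assumes `time.sleep` is a fair stand-in for a blocking read (it releases the GIL while waiting), and the names `fake_io` and `run_in_threads` are made up for illustration. It uses `concurrent.futures.ThreadPoolExecutor`, which is built on top of the `threading` module.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    # Hypothetical stand-in for a blocking network or disk call.
    # time.sleep releases the GIL, so other threads can run meanwhile.
    time.sleep(delay)
    return task_id


def run_in_threads(n_tasks=4, delay=0.2):
    # Run all the waits concurrently and report the wall-clock time taken.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(lambda i: fake_io(i, delay), range(n_tasks)))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_in_threads()
    # Run one after another, the four waits would take about 0.8 seconds.
    print(results, f"{elapsed:.2f}s with threads")
```

Because the threads spend their time waiting rather than running bytecode, the waits overlap and the total comes out close to a single delay. Swap the sleep for a pure-Python loop and the threads would take turns holding the GIL instead.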

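Multiprocessing spawns a pool of worker processes and doles out tasks to them by serializing the data with pickle and sending it over pipes for IPC. This naturally has a lot of overhead, and there's some subtlety with the data serialization: everything you pass to a worker or get back from it has to be picklable. So, be aware of that. The nice part, though, is that each process has its own interpreter and its own GIL, so CPU-bound Python code can actually run on several cores at once, which can sometimes speed things up.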
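A rough sketch of both the pool and the serialization subtlety, under a few assumptions: `square` and `can_pickle` are hypothetical names, the workload is a toy, and the pool size of 2 is arbitrary. The point is that a task function has to be something pickle can find by name, which is why a module-level function works and a lambda does not.

```python
import pickle
from multiprocessing import Pool


def square(n):
    # Module-level function: pickle can refer to it by its qualified name.
    return n * n


def can_pickle(obj):
    # Hypothetical helper: report whether pickle accepts the object,
    # which is a precondition for sending it to a worker process.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # The guard matters: with the "spawn" start method, each worker
    # re-imports this module and must not start a pool of its own.
    print(can_pickle(square))            # module-level function
    print(can_pickle(lambda n: n * n))   # lambda has no importable name
    with Pool(processes=2) as pool:
        # Arguments and results are pickled on their way to and from workers.
        print(pool.map(square, range(5)))
```

For a task this tiny the pickling and process startup cost far more than the work itself, so a pool only pays off once each task does a meaningful amount of computation.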