Show HN: Bringing multithreading to Python's async event loop
82 points | nbsande | 1 year ago | github.com
While this was initially built with enhancing CPU utilization for FastAPI servers in mind, the approach can be used with more general async programs too.
If you’re interested in diving deeper into the details, I’ve written a blog post about it here: https://www.neilbotelho.com/blog/multithreaded-async.html
noident|1 year ago
There is built-in support for this. Take a look at loop.run_in_executor. You can await something scheduled in a separate Thread/ProcessPoolExecutor.
Granted, this is different than making the async library end-to-end multi-threaded as you seem to be trying to do, but it does seem worth mentioning in this context. You _can_ have async and multiple threads at the same time!
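A minimal sketch of the built-in approach described above, assuming a CPU-bound function (the `cpu_heavy` name is illustrative) that we want off the event loop:

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor

def cpu_heavy(data: bytes) -> str:
    # A blocking, CPU-bound function we don't want running on the event loop.
    return hashlib.sha256(data).hexdigest()

async def main() -> str:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        # Runs in a worker thread; the event loop stays free to serve
        # other coroutines while we await the result.
        return await loop.run_in_executor(pool, cpu_heavy, b"hello")

result = asyncio.run(main())
print(result)
```

Passing `None` instead of a pool uses the loop's default executor; a `ProcessPoolExecutor` can be substituted when the GIL would otherwise serialize the work.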
nbsande|1 year ago
game_the0ry|1 year ago
When I am trying to solve a technical problem, the problem is going to dictate my choice of tooling.
If I am doing some fast scripting or I need to write some glue code, Python is my go-to. But if I need resource efficiency, multithreading, non-blocking async I/O, and/or high performance, I would not consider Python - I would probably use the JVM over the best Python option.
Don't get me wrong, I think this is worthwhile to explore, and I certainly do not think it's a wasted effort (quite the opposite, this gets my upvote). I just don't think I would ever use it if I had a use case demanding performance and resource efficiency.
Myrmornis|1 year ago
mywittyname|1 year ago
I've been using ThreadPoolExecutors in Python for a while now. They seem to work pretty well for my use cases. Granted, my use cases don't require things like shared memory segments; I use the as_* functions from concurrent.futures to recombine the data as needed. Honestly, I prefer the futures functions as I don't need to think about deadlocks.
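A short sketch of this pattern with `concurrent.futures.as_completed` (the `fetch` function here is a hypothetical stand-in for real I/O work):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch(n: int) -> int:
    # Stand-in for an I/O-bound call (e.g. an HTTP request).
    return n * n

results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(fetch, n) for n in range(5)]
    # as_completed yields each future as it finishes, so results
    # can be recombined as they arrive, without manual locking.
    for fut in as_completed(futures):
        results.append(fut.result())

print(sorted(results))
```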
trashtester|1 year ago
And in many teams, just having to worry about python makes it easier to keep team members productive if they're not expected to handle several different languages productively.
gloryjulio|1 year ago
An example is facebook's php to hack compiler
nbsande|1 year ago
m11a|1 year ago
My understanding was that tasks in an event loop should yield after they dispatch IO tasks, which means whatever keeps the event loop busy should be CPU-bound work, right? If so, multithreading should not help much in theory?
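A minimal sketch of the behavior in question: a coroutine that does CPU work between awaits never yields, so everything else scheduled on the loop is starved until it finishes (durations are illustrative):

```python
import asyncio
import time

async def cpu_bound(duration: float) -> None:
    # Busy loop with no await: nothing else on the loop can run.
    end = time.monotonic() + duration
    while time.monotonic() < end:
        pass

async def main() -> float:
    start = time.monotonic()
    sleeper = asyncio.create_task(asyncio.sleep(0.01))
    # Awaiting a coroutine with no internal awaits runs it to
    # completion inline; the sleeper task can't even start yet.
    await cpu_bound(0.2)
    await sleeper
    return time.monotonic() - start

elapsed = asyncio.run(main())
# elapsed is ~0.2s, not ~0.01s: the 10ms sleep was starved by CPU work.
```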
spiffytech|1 year ago
I've seen code that spends disproportionate CPU time on, e.g., JSON (de)serializing large objects, or converting Postgres result sets into native data structures, but sometimes it's just plain ol' business logic. And with enough traffic, any app gets too busy for one core.
Single-threaded langs get around this by deploying multiple copies of the app on each server to use up the cores. But that's less efficient than a single, parallel runtime, and eliminates some architectural options.
limaoscarjuliet|1 year ago
bastawhiz|1 year ago
Which is to say, why even bother with async if you want your code to be fully threaded? Async is an abstraction designed specifically to address the case where you're dealing with blocking IO on a single thread. If you're fully threaded, the problems async addresses don't exist anymore. So why bother?
quotemstr|1 year ago
btown|1 year ago
My startup has been using it in production for years. It excels at I/O bound workflows where you have highly concurrent real-time usage of slow/unpredictable partner APIs. You just write normal (non-async) Python code and the patched system internals create yields to the event loop whenever you’d be waiting for I/O, giving you essentially unlimited concurrency (as long as all pending requests and their context fit in RAM).
https://github.com/gfmio/asyncio-gevent does exist to let you use asyncio code in a gevent context, but it’s far less battle-tested and we’ve avoided using it so far.
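For reference, a minimal sketch of the gevent pattern described above (requires the third-party gevent package; the `worker` function and timings are illustrative):

```python
# Requires gevent: pip install gevent
from gevent import monkey
monkey.patch_all()  # must run before other imports touch sockets, time, etc.

import time
import gevent

def worker(n: int) -> int:
    # Plain, non-async code: time.sleep is now gevent's cooperative
    # sleep, so while this greenlet "waits", others run on the same
    # OS thread.
    time.sleep(0.1)
    return n * 2

start = time.monotonic()
jobs = [gevent.spawn(worker, n) for n in range(10)]
gevent.joinall(jobs)
elapsed = time.monotonic() - start
results = [job.value for job in jobs]
# Ten 0.1s "sleeps" overlap, so elapsed is roughly 0.1s, not 1s.
```

The same mechanism applies to patched socket calls, which is what makes ordinary blocking HTTP clients yield to the event loop transparently.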
drowsspa|1 year ago
nbsande|1 year ago