strictfp | 2 years ago

My tip for this is Node.js and some stream processing lib like Highland. You can get ridiculous IO parallelism with very little code and a nice API.

Python just scales terribly, whether or not you use multiple processes. Java can get pretty good perf, but you'll need some libs or quite a bit of code to get non-blocking IO sending working well; otherwise you're going to eat huge amounts of resources for moderate returns.

Node really excels at this use case. You can saturate the lines pretty easily.

hughesjj|2 years ago

0_o

Did I miss something? Does Node/Highland have good shared memory semantics these days?

I've always felt the best analogy to Python concurrency was (node)js, but I admittedly haven't kept up all that well.

goatlover|2 years ago

Wouldn't Elixir or Go be better for this use case? Node still blocks on compute-heavy tasks.

porridgeraisin|2 years ago

I think they mentioned CPU-intensive work, which I'm taking to imply that it's more CPU-bound than I/O-bound. So unless you're suggesting they use Node's worker threads (its take on web workers) for parallelism, the default single-threaded async concurrency model probably won't serve them well.
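
To make the point concrete, here is a rough sketch of how CPU-bound work starves a single-threaded event loop. It uses Python's asyncio as a stand-in for Node's event loop (an assumption on my part; the model is similar, not identical). The names busy_sum, ticker, and ticks_during are made up for illustration.

```python
import asyncio
import contextlib


def busy_sum(n: int) -> int:
    """CPU-bound stand-in: pure-Python arithmetic with no awaits."""
    total = 0
    for i in range(n):
        total += i * i
    return total


async def ticker(state: dict) -> None:
    """Stand-in for other pending work: counts each turn the loop gives it."""
    while True:
        state["ticks"] += 1
        await asyncio.sleep(0)  # yield back to the event loop


async def ticks_during(run_inline: bool, n: int = 200_000) -> int:
    """Return how many turns the ticker got while busy_sum(n) was running."""
    state = {"ticks": 0}
    task = asyncio.create_task(ticker(state))
    await asyncio.sleep(0)  # give the ticker its first turn
    before = state["ticks"]
    if run_inline:
        # Runs on the event-loop thread: nothing else can be scheduled.
        busy_sum(n)
    else:
        # Hand the work to the default thread pool and await the result,
        # which lets the loop keep servicing other tasks meanwhile.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, busy_sum, n)
    progressed = state["ticks"] - before
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return progressed
```

Calling asyncio.run(ticks_during(True)) gives 0, because the ticker never gets a turn while the inline computation runs. With run_inline=False the ticker gets at least one turn. Note that offloading to a thread only keeps the loop responsive; for actual CPU parallelism you would still want separate processes or worker threads.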

pid-1|2 years ago

Isn't Node single threaded, just like Python?

krylon|2 years ago

Python is technically multithreaded, but the GIL means only one thread can execute interpreter code at a time. If you use libraries written in C/C++, the library code can run in multiple threads simultaneously if it releases the GIL.
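
A minimal sketch of that difference, assuming a standard (GIL-enabled) CPython build. Both functions below run in a thread pool, but only the hashlib one can overlap across threads, since per the hashlib docs it releases the GIL when hashing more than 2047 bytes at once. The helper names and the chunk sizes are invented for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def pure_python_checksum(data: bytes) -> int:
    """Interpreter bytecode only: the running thread holds the GIL while it works."""
    total = 0
    for byte in data:
        total = (total + byte) % 65521
    return total


def c_backed_digest(data: bytes) -> str:
    """C implementation: hashlib releases the GIL for large inputs."""
    return hashlib.sha256(data).hexdigest()


def run_in_threads(func, chunks, workers: int = 4) -> list:
    """Apply func to every chunk using a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, chunks))


# Four hypothetical 100 kB payloads.
chunks = [bytes([i]) * 100_000 for i in range(4)]

checksums = run_in_threads(pure_python_checksum, chunks)  # threads take turns
digests = run_in_threads(c_backed_digest, chunks)         # threads can overlap
```

Both calls return correct results; the difference is only in how much of the work can run at the same time. Timing the two on a multi-core machine is the easy way to see it, though the numbers will vary by machine and Python version.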

I vaguely recall Node used to run multiple threads under the hood for disk I/O, but it might use kqueue/epoll these days.