Piscina – The Node.js Worker Pool

75 points| AquiGorka | 5 years ago |github.com

47 comments

[+] chmod775|5 years ago|reply
Here's my issue with this kind of "magic" API design:

It's not clear when or how serialization happens.

Supposedly at some point the message I am sending to the worker is serialized, but this is not made clear to the user and it's not made clear when this happens.

A well-designed library would synchronously serialize the given object the moment it is passed in or let the user explicitly handle serialization. But I don't think that's what is happening here.

It appears messages are serialized only eventually and when they are finally sent off to a worker.

If you accidentally pass mutable state in here, you're in for a really confusing and fun debugging session. Likely it'll be a production-only bug too, because during development and testing you're unlikely to have the kind of message volume required to run into some modified-before-sent condition.

CTRL-F "mutable" and CTRL-F "serialize" gives no results, so I don't think the designers thought of this or thought to warn users.
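A minimal sketch of the hazard described above, using a toy queue rather than Piscina itself. `LazyQueue` is a hypothetical stand-in that copies tasks only at dispatch time; whether Piscina behaves this way is the commenter's reading, not something verified here. It assumes a Node.js version where `structuredClone` is a global (17 or later).

```javascript
// Toy stand-in for a pool that queues tasks and serializes them only when
// they are dispatched. This is NOT Piscina's implementation; it only models
// the "modified before sent" condition.
class LazyQueue {
  constructor() {
    this.pending = [];
  }

  // Stores a reference to the task; nothing is copied yet.
  submit(task) {
    this.pending.push(task);
  }

  // "Dispatch": the copy happens here, possibly long after submit().
  flush() {
    const sent = this.pending.map((task) => structuredClone(task));
    this.pending = [];
    return sent;
  }
}

const queue = new LazyQueue();
const job = { id: 1, status: 'new' };

queue.submit(job);                  // hazard: the queue holds a live reference
queue.submit(structuredClone(job)); // defence: snapshot taken at the call site

job.status = 'mutated after submit';

const [lazy, snapshot] = queue.flush();
console.log(lazy.status);     // 'mutated after submit'
console.log(snapshot.status); // 'new'
```

Cloning at the call site costs an extra copy per task, but it pins the payload to the moment of submission regardless of when the library serializes it.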

[+] nimbix|5 years ago|reply
I definitely ran into some serialization issues when trying Piscina a while ago. Returning generated images from the workers would keep getting slower and slower until the overhead of data exchange was 20x the time it took to generate it.

I discovered through trial and error that returning image bytes directly would incur this ever-increasing overhead, but creating a binary buffer and returning that would not. It was just a pet weekend project, so I never did discover what was causing the issue.

[+] cjdell|5 years ago|reply
What would be nice is a way of preventing the same job running multiple times concurrently. Like if I start a job and a job with the same parameters was already started milliseconds ago then it automatically awaits the already running job rather than starting another.
[+] sbarre|5 years ago|reply
You could build a reasonably lightweight supervisor pattern that uses a parameter-derived hash for comparison to handle this kind of situation in your application too.

Might be easier and more flexible than asking the library to do it?
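One way the application-level approach could look, as a rough sketch: keep a map of in-flight promises keyed by the job parameters and hand the existing promise to any caller that asks for the same work. `runOnce`, `inFlight`, and `slowJob` are hypothetical names, and `JSON.stringify` stands in for the parameter-derived hash.

```javascript
// Hypothetical helper: share one in-flight promise per distinct parameter set.
const inFlight = new Map();

function runOnce(params, startJob) {
  // JSON.stringify is a simple key. It is sensitive to property order, so a
  // real application might hash a canonicalised form of the parameters.
  const key = JSON.stringify(params);
  const existing = inFlight.get(key);
  if (existing) return existing;

  // Start the job immediately; forget it once it settles so that a later
  // call with the same parameters starts a fresh run.
  const promise = new Promise((resolve) => resolve(startJob(params)))
    .finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}

// Demo job: counts how many times it was actually started.
let started = 0;
const slowJob = (params) => {
  started += 1;
  return new Promise((resolve) => setTimeout(() => resolve(params.n * 2), 10));
};

const first = runOnce({ n: 21 }, slowJob);
const second = runOnce({ n: 21 }, slowJob); // same parameters: joins the running job
const other = runOnce({ n: 5 }, slowJob);   // different parameters: starts a new job

Promise.all([first, second, other]).then((results) => {
  console.log(results, 'jobs started:', started);
});
```

With a worker pool, `startJob` would be whatever function submits the task to the pool; the deduplication layer does not need to know about workers at all.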

[+] lyjackal|5 years ago|reply
Interesting that it's written in TypeScript and the readme doesn't mention using it with TypeScript. I ran into annoyances with this recently when trying out Node's worker threads in a TypeScript project. I was running with ts-node, but the worker thread didn't know how to load a TypeScript file. There are some workarounds, but they're not elegant.
[+] freeqaz|5 years ago|reply
Does anybody use this or anything similar? If so, what problems are you solving?
[+] monstermachine|5 years ago|reply
I use workers in deno to evaluate user-facing code which may take a lot of time to finish and needs sandboxing. Another place they're used is where I want to keep context and reuse it for subsequent code execution for each user.

In deno, you can restrict what the code inside a worker can do by passing a map of allowed permissions. I have built a simple privilege system on top of this to allow users different access levels.

This is cheaper and faster than spinning up a container.

[+] Etheryte|5 years ago|reply
The worker API in JavaScript is quite a pain to use, but needless to say multithreading is invaluable in many contexts, both in the browser and in Node. I haven't used this library, but it seems to solve a similar problem to other ones in the same space: making multithreaded code sane to write and sparing you a bunch of repetitive boilerplate.
[+] rektide|5 years ago|reply
Knex, the SQL query builder, uses Tarn.js[1] for connection pooling to the db.

I've been using Tarn a bunch at work recently. We're doing some batch jobs, and I'm queuing work at each stage in Tarn.js pools. I created my own enqueue function that waits until the pool is less than a high-water mark in size before enqueueing. Then the pool has however many workers running.

Neither of these is an off-thread pool. But they help a lot for managing multiple async operations.

[1] https://github.com/vincit/tarn.js/
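The enqueue-with-a-high-water-mark idea can be sketched without any library. This is not Tarn.js's API: `BoundedQueue`, `highWaterMark`, and `work` are illustrative names, and the sketch folds the pool and the queue in front of it into a single counter for brevity.

```javascript
// Hypothetical sketch of backpressure: enqueue() does not resolve until the
// number of accepted-but-unfinished tasks is below the high-water mark.
class BoundedQueue {
  constructor(highWaterMark) {
    this.highWaterMark = highWaterMark;
    this.active = 0;   // tasks accepted and not yet finished
    this.waiters = []; // resolvers for producers blocked in enqueue()
  }

  // Resolves once the task has been accepted (not once it has finished).
  // The returned `done` promise settles when the task itself settles, so
  // callers should handle its rejection.
  async enqueue(task) {
    // Re-check after every wake-up: another producer may have taken the slot.
    while (this.active >= this.highWaterMark) {
      await new Promise((resolve) => this.waiters.push(resolve));
    }
    this.active += 1;
    const done = Promise.resolve()
      .then(() => task())
      .finally(() => {
        this.active -= 1;
        const next = this.waiters.shift();
        if (next) next(); // wake one blocked producer
      });
    return { done };
  }
}

async function main() {
  const queue = new BoundedQueue(2);
  let running = 0;
  let peak = 0;

  // Demo task: records how many tasks overlap.
  const work = () => new Promise((resolve) => {
    running += 1;
    peak = Math.max(peak, running);
    setTimeout(() => { running -= 1; resolve(); }, 5);
  });

  const handles = [];
  for (let i = 0; i < 6; i += 1) {
    // The producer pauses here whenever two tasks are already in flight.
    handles.push(await queue.enqueue(work));
  }
  await Promise.all(handles.map((handle) => handle.done));
  return peak;
}

const peakPromise = main();
peakPromise.then((peak) => console.log('peak concurrency:', peak));
```

The point of awaiting `enqueue` in the producer loop is that a fast upstream stage cannot pile up unbounded work in memory while a slower stage drains it.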

[+] ddoolin|5 years ago|reply
We use worker threads directly to process large unorganized (for the browser) datasets and do some deductions before it hits the store.

We also have a worker thread blocked on a redis channel that acts as a queue.

[+] Skhalar|5 years ago|reply
This is useful for processing large chunks of data like audio files (look at the Superpowered SDK) by breaking them down, or when processing multiple files.
[+] barefeg|5 years ago|reply
I’m guessing the fact that node is single threaded
[+] maxrev17|5 years ago|reply
Might be quite nice for keeping tokens refreshed?
[+] ericlewis|5 years ago|reply
Neat! I’m friends with the creator of this and teased him a bit about the name (so many of these projects have weird names nowadays).

The reason is: don’t wanna be boring from what I could glean.

[+] kevinstubbs|5 years ago|reply
"piscina" in Italian means "pool". A library for worker pools simply named "pool" in Italian doesn't seem that strange :)
[+] revskill|5 years ago|reply
Could I use it on a serverless platform?
[+] timmit|5 years ago|reply
I assume you can, but how it performs really depends on the virtual CPU of the platform.

I did something similar

https://github.com/tim-hub/pambdajs

but I haven't done the comparison on AWS Lambda yet

[+] 29athrowaway|5 years ago|reply
Common mistake in stream code:

    -- .on('end')
    ++ .once('end')
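For readers unfamiliar with the distinction, a small sketch on a plain `EventEmitter` (not a real stream): a listener added with `.on` stays registered for every emission, while one added with `.once` is removed after its first call. Emitting `'end'` twice is artificial and only there to expose the difference.

```javascript
const { EventEmitter } = require('node:events');

const source = new EventEmitter();
let onCalls = 0;
let onceCalls = 0;

source.on('end', () => { onCalls += 1; });     // stays registered
source.once('end', () => { onceCalls += 1; }); // removed after the first call

source.emit('end');
source.emit('end'); // artificial second emission, to show the difference

console.log({ onCalls, onceCalls });      // { onCalls: 2, onceCalls: 1 }
console.log(source.listenerCount('end')); // 1
```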
[+] hfktk4nrn|5 years ago|reply
Is it just me, or do I see a trend in naming projects using Romance language words (Italian/Spanish/French)?

Does it sound more exotic? Are these words less crowded?

[+] dang|5 years ago|reply
Could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

You needn't use your real name, of course, but for HN to be a community, users need some identity for other users to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?query=community%20identity%20by:dang...

[+] G4BB3R|5 years ago|reply
I think it's because most English names are already taken.
[+] andreynering|5 years ago|reply
I'm curious why you call these languages "romance" languages.

"Piscina" is a Portuguese word for "pool", by the way.