I may be naive in asking this, but what leads someone to building high perf data tools in JS? JS doesn't seem to me like it would be the tool of choice for such things
I have a SaaS project where the backend is in JS. I also have some data processing to do with large file (several TB). Doing it is in JS is more convenient as I can reuse code from the backend, and it is also the language I know best.
Performance-wise, I get about half the throughput I had with the same processsing done it rust, which doesn't change anything for my use-case.
However that's not really relevant to the context of the post as I'm using node.js streams which are both saner and fast. I'm guessing that the post is relevant to people using server-side runtimes that only implement web streams.
Browsers are now able to stream files from disk so you can create a high performance tool that'll read locally, do [x] with it and present the results, all without any network overhead.
you pass in your own Uint8Array allocation (in go terms, []byte), the reader fills at most the entire thing, and returns (bytes filled, error). it’s a fully pull stream API with one method at its core. now, the api gets to be that simple because it’s always sync, and blocks until the reader can fill data into the buffer or returns an error indicating no data available right now.
go has a TeeReader with no buffering - it too just blocks until it can write to the forked stream.
we can’t do the same api in JS, because go gets to insert `await` wherever it wants with its coroutine/goroutine runtime. but we can dream of such simplicity combined with zero allocation performance.
slowcache|2 days ago
I may be naive in asking this, but what leads someone to building high perf data tools in JS? JS doesn't seem to me like it would be the tool of choice for such things
moron4hire|2 days ago
n_e|2 days ago
Performance-wise, I get about half the throughput I had with the same processsing done it rust, which doesn't change anything for my use-case.
However that's not really relevant to the context of the post as I'm using node.js streams which are both saner and fast. I'm guessing that the post is relevant to people using server-side runtimes that only implement web streams.
afavour|2 days ago
thadt|2 days ago
jitl|2 days ago
right now when i need to wrangle bytes, i switch languages to Golang. it’s easy gc language, and all its IO is built around BYOB api:
interface Reader { read(b: Uint8Array): [number, Error?] }
you pass in your own Uint8Array allocation (in go terms, []byte), the reader fills at most the entire thing, and returns (bytes filled, error). it’s a fully pull stream API with one method at its core. now, the api gets to be that simple because it’s always sync, and blocks until the reader can fill data into the buffer or returns an error indicating no data available right now.
go has a TeeReader with no buffering - it too just blocks until it can write to the forked stream.
https://pkg.go.dev/io#TeeReader
we can’t do the same api in JS, because go gets to insert `await` wherever it wants with its coroutine/goroutine runtime. but we can dream of such simplicity combined with zero allocation performance.