We use the WASM build of DuckDB quite extensively at Count (https://count.co - 2-3m queries per month). There are a couple of bugs we've noticed, but given that it's pretty much maintained by a single person, it seems impressively reliable!
Author here. Thank you all for the comments. I take full responsibility for stupidly using an image for posting the code snippet. Sorry for that! Also, the article was originally posted almost 2 years ago (and "resurrected" with the recent migration to Medium). This is why a fairly old DuckDB version is referenced there. Some of the issues I observed are now gone too.
Obviously, many things have changed since then. We've experimented extensively and moved back and forth with using DuckDB for our internal cloud processing architecture. We eventually settled on just using it for reading the data and then handling everything else in custom workers. Even using TypeScript, we achieved close to 1M events/s per worker overall with very high scalability.
However, our use-case is quite distinct. We use a custom query engine (for sequence processing), which has driven many design decisions.
Overall, I think DuckDB (both the vanilla and WASM versions) is absolutely phenomenal. It has also matured since my original blog post. I believe we'll only see more and more projects using it as their backbone. For example, MotherDuck is doing some amazing things with it (e.g., https://duckdb.org/2023/03/12/duckdb-ui) but there are also many more exciting initiatives.
> [wasm] is executed in a stack-based virtual machine rather than as a native library code.
Wasm's binary format is indeed a stack-based virtual machine, but that is not how it is executed. Optimizing VMs convert it to SSA form, basic blocks, and finally machine code, much the same as clang or gcc compile native library code.
It is true that wasm has some overhead, but that is due to portability and sandboxing, not the stack-based binary format.
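To make the distinction concrete, here is a minimal sketch (not from the article): a hand-assembled module whose function body is written as stack operations (`local.get`, `local.get`, `i32.add`). The byte layout follows the wasm binary format, and the export name `add` is just an illustrative choice. How the engine executes it (baseline compiler, optimizing tier, and so on) is up to the engine; the stack form is only the encoding.

```typescript
// Hand-assembled wasm module exporting add(a, b). Illustrative only.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no extra locals
  0x20, 0x00, // local.get 0  (push a)
  0x20, 0x01, // local.get 1  (push b)
  0x6a, //       i32.add      (pop two values, push their sum)
  0x0b, //       end
]);

// The engine validates the stack-based bytecode and compiles it;
// callers only ever see an ordinary function.
const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule);
const add = instance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3));
```

The body is three stack instructions, but nothing in the API exposes a stack at run time.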
> On top of the above, memory available to WASM is limited by the browser (in case of Chrome, the limit is currently set at 4GB per tab).
wasm64 solves this by allowing 64-bit pointers and far more than 4GB of memory.
The feature is already supported in Chrome and Firefox, but not everywhere else yet.
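For reference, the 4GB figure falls straight out of 32-bit addressing. A rough sketch of the arithmetic, plus the standard `WebAssembly.Memory` API growing a small 32-bit memory (the `initial` and `maximum` values here are arbitrary, and engines may enforce a lower cap than the theoretical one):

```typescript
const PAGE_SIZE = 64 * 1024; // wasm page size in bytes, fixed by the spec
const ADDRESS_SPACE_32 = 2 ** 32; // bytes reachable with a 32-bit pointer

const maxPages32 = ADDRESS_SPACE_32 / PAGE_SIZE; // theoretical page limit for wasm32
const maxGiB32 = ADDRESS_SPACE_32 / 2 ** 30; // same limit expressed in GiB

// A tiny 32-bit memory: starts at 1 page, may grow up to 4 pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const before = memory.buffer.byteLength;
const previousPages = memory.grow(1); // grow by one page; returns the old page count
const after = memory.buffer.byteLength;

console.log({ maxPages32, maxGiB32, before, previousPages, after });
```

With 64-bit pointers the same arithmetic no longer caps out at 4 GiB, which is the point of wasm64.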
In the past few years there was this blog post[0] that clarified this. It moved the restriction on serving a "disproportionate percentage of pictures, audio files, or other large files" to another part of the TOS dedicated specifically to the CDN part[1] and clarified that, if you're using Cloudflare add-on services Stream, R2 (their S3), or Cloudflare Images, then you won't be at risk of termination.
mcraiha|11 months ago
In this case 25 lines of code is 50 kB of image binary.
Also, it cannot be found via a search engine, nor can it be read with a screen reader.
doubled112|11 months ago
Never should I receive a Java exception hundreds of lines long as a cut off JPEG file.
Or a screenshot of a Google Sheet missing the information you’re talking to me about.
goda90|11 months ago
fragmede|10 months ago
markerz|11 months ago
drtgh|10 months ago
I mean, if such coloring is going to be done, it should be done with HTML/CSS.
For the OP article, for screen readers, perhaps you are suggesting people use the alt attribute or similar.
paulsutter|11 months ago
[deleted]
jasmcole|11 months ago
jillyboel|11 months ago
pmm|10 months ago
azakai|11 months ago
geokon|11 months ago
I'm still not clear on what, at its core, it does differently (in a way that couldn't be bolted onto a subset of the JVM).
tobilg|11 months ago
I’m using an (older) v1.29.1 dev version with https://sql-workbench.com w/o any bigger issues.
__mp|11 months ago
I’m not sure if this goes against the Cloudflare TOS though (last time I checked they had some provisions against processing images).
judge2020|11 months ago
0: https://blog.cloudflare.com/updated-tos/
1: The restriction still exists at https://www.cloudflare.com/service-specific-terms-applicatio... under "Content Delivery Network (Free, Pro, or Business)".
httgp|11 months ago
bobnamob|11 months ago
tobilg|11 months ago
I’m running it on AWS Lambda functions with some success.
canadiantim|11 months ago
curtisszmania|11 months ago
[deleted]