Reality Check for Cloudflare Wasm Workers and Rust

dgreensp|4 years ago

I think the one-sentence version of this is that Workers are meant for small, undemanding tasks (for example, they have tight memory limits and don’t have great performance), so using them to do “serious number crunching” at the edge, which is the advertised use case, seems questionable.

I think the blurb about the downsides of Wasm is just too generic; it’s a sort of “why Wasm isn’t preferable to JS in all cases” for the uninitiated. It may not be meant to imply that number crunching is the use case.

kentonv|4 years ago

> I think the one-sentence version of this is that Workers are meant for small, undemanding tasks

Not at all! We're building a platform on which you can build your entire app. What you say may have been the case four years ago when Workers launched, but since then we've added Durable Objects, Cron triggers, much longer time limits, etc. We very much believe Workers can be a stand-alone alternative to other cloud providers.

> for example, they have tight memory limits and don’t have great performance

This isn't true.

"Performance" is a vague term, you need to clarify the use case and what you're measuring. But, I can't think of what you could mean by "don't have great performance", that seems to imply that they execute code slower or something, which just isn't true at all. In many cases, Workers perform much better than you could achieve with any other platform, due to the ability to spread work and data across the network and move it close to where it's needed.

The "memory limit" on a single worker instance is 128MB, but Cloudflare runs many instances of the worker around the world, so across the network you're really getting many gigabytes of memory. By building a distributed system based on Durable Objects, you can harness the memory of many instances to use on a single task. Workers definitely biases towards distributing load across the network rather than running a single fat instance of your server, but that just means Workers makes it easy to build apps that scale much higher.

What this article is highlighting is that Wasm is still an immature technology. That is, unfortunately, just a fact. There's still work to be done, and progress is being made, but it's still early. The code footprint issue (because every app must bring along its own language runtime) is the biggest blocker. We hope to see that solved with dynamic linking.

But, Workers isn't primarily based on Wasm. The vast majority of Workers are written in JavaScript, where these issues don't exist. Workers runs JavaScript just as fast as any Node.js server, and runs it closer to the client resulting in better latency.

csomar|4 years ago

> I think the one-sentence version of this is that Workers are meant for small, undemanding tasks (for example, they have tight memory limits and don’t have great performance)

That could describe most web app functionality: things like registration, authorization/authentication, sending emails, storing and retrieving data, etc.

> so using them to do “serious number crunching” at the edge, which is the advertised use case, seems questionable.

Cloudflare workers don't run in the background. They block the HTTP request. For serious computation, Cloudflare should offer background workers that can run for extended periods of time. [1]

1: This could be worked around by triggering an async request, but there is no push API to notify the "App" of the result.

brundolf|4 years ago

I for one hadn't thought about the cost of shipping your own standard library with every bundle, so it was informative for me

brundolf|4 years ago

> I guess I’ll stick with my error prone Javascript Workers or, more likely, spend an afternoon migrating to a minimal Typescript setup.

If the OP wants a zero-config typescript experience (assuming Deno isn't available on Cloudflare workers), I can't recommend esbuild enough

comagoosie|4 years ago

I think this is an excellent suggestion (I'm OP / author), and one can add just a dash to this for typechecking. Minimal setups are appreciated, especially when one has many small projects.

wtetzner|4 years ago

There's also js_of_ocaml if a good type system is desired.

dafelst|4 years ago

Great overview OP, and it's nice to see a kind of "in-between" scenario tested, i.e. not a super fast web request or transformation, rather something more akin to a lightweight batch job. It may not be quite a "recommended" use case but it is always interesting (for me at least) to see how these sorts of services' capabilities can be pushed and/or (gently) abused. The memory and code size limitations do seem very restrictive right now, which is a shame though.

Seeing WASM evolve as the new sandboxed runtime target du jour is super interesting, and I love that it is bringing a greater variety of very powerful but traditionally backend or systems languages to the web.

up6w6|4 years ago

iirc the compilation time of wasm in Cloudflare Workers is very problematic [1] and right now it contradicts their idea of running low-latency, fast scripts. Does anyone know if anything has changed?

[1] https://community.cloudflare.com/t/fixed-cloudflare-workers-...

kentonv|4 years ago

Yes, it has changed, which is why the thread you linked has "[FIXED]" in the title. Details can be found later on in the thread.

jfrunyon|4 years ago

I believe a zip file could be streamed - most of the file metadata is duplicated between both the 'central directory record' trailer and a header in front of each file. In other words, the first thing in the zip file is a header that you can use to extract the first file, followed by that file, followed by the next file's header...
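
A minimal sketch of the per-file header described above, assuming the standard 30-byte local file header layout and a classic (non-ZIP64) archive. The names `readLocalHeader`, `zipEntry`, and `nextHeaderOffset` are made up for illustration. One caveat for streaming: when bit 3 of the flags field is set, the sizes in this header are zero and the real values follow the data in a trailing data descriptor, so the reader cannot skip ahead by size alone.

```typescript
// Fields pulled from a ZIP local file header (30 fixed bytes, then name, extra field, data).
interface LocalHeader {
  method: number;         // 0 = stored, 8 = deflate
  compressedSize: number;
  nameLength: number;
  extraLength: number;
  dataOffset: number;     // where this entry's data begins
  sizesDeferred: boolean; // flag bit 3: sizes live in a trailing data descriptor
}

function readLocalHeader(bytes: Uint8Array, offset: number): LocalHeader {
  const view = new DataView(bytes.buffer, bytes.byteOffset + offset);
  // Every local file header starts with the little-endian signature "PK\x03\x04".
  if (view.getUint32(0, true) !== 0x04034b50) {
    throw new Error("not a local file header");
  }
  const flags = view.getUint16(6, true);
  const nameLength = view.getUint16(26, true);
  const extraLength = view.getUint16(28, true);
  return {
    method: view.getUint16(8, true),
    compressedSize: view.getUint32(18, true),
    nameLength: nameLength,
    extraLength: extraLength,
    dataOffset: offset + 30 + nameLength + extraLength,
    sizesDeferred: (flags & 0x08) !== 0,
  };
}

// Hand-built entry for illustration: a stored file named "a" containing "hi".
const zipEntry = new Uint8Array(30 + 1 + 2);
const entryWriter = new DataView(zipEntry.buffer);
entryWriter.setUint32(0, 0x04034b50, true); // signature
entryWriter.setUint32(18, 2, true);         // compressed size
entryWriter.setUint32(22, 2, true);         // uncompressed size
entryWriter.setUint16(26, 1, true);         // file name length
zipEntry[30] = 0x61;                        // "a"
zipEntry[31] = 0x68;                        // "h"
zipEntry[32] = 0x69;                        // "i"

const localHeader = readLocalHeader(zipEntry, 0);
// The next entry's header starts right after this entry's data.
const nextHeaderOffset = localHeader.dataOffset + localHeader.compressedSize;
```

In a streaming reader, `nextHeaderOffset` is where the following file's header would be parsed, which is the "header, file, header, file" pattern the comment describes.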

stavros|4 years ago

You can, yes. You can even download the header, open the zip file, choose which files to extract, and only download those. This is possible with HTTP today, and has been for decades.
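
A rough sketch of the first step of that selective download, assuming a classic (non-ZIP64) archive whose trailing comment does not happen to contain the signature bytes. The idea is to fetch only the tail of the archive, locate the end-of-central-directory record, and derive the byte range that holds the file listing. `findCentralDirectory`, `rangeHeader`, and the sample numbers are hypothetical; no network request is made here.

```typescript
interface CentralDirectoryLocation {
  entryCount: number;
  size: number;   // bytes occupied by the central directory
  offset: number; // absolute offset from the start of the archive
}

// Scan backwards for the end-of-central-directory signature ("PK\x05\x06").
function findCentralDirectory(tail: Uint8Array): CentralDirectoryLocation {
  const view = new DataView(tail.buffer, tail.byteOffset, tail.byteLength);
  // The record is at least 22 bytes long, so start 22 bytes from the end.
  for (let i = tail.byteLength - 22; i >= 0; i--) {
    if (view.getUint32(i, true) === 0x06054b50) {
      return {
        entryCount: view.getUint16(i + 10, true),
        size: view.getUint32(i + 12, true),
        offset: view.getUint32(i + 16, true),
      };
    }
  }
  throw new Error("end-of-central-directory record not found");
}

// Inclusive byte range in the form the HTTP Range header expects.
function rangeHeader(offset: number, size: number): string {
  return "bytes=" + offset + "-" + (offset + size - 1);
}

// Stand-in for the last 22 bytes of an archive, e.g. fetched with "Range: bytes=-22".
const archiveTail = new Uint8Array(22);
const tailWriter = new DataView(archiveTail.buffer);
tailWriter.setUint32(0, 0x06054b50, true); // signature
tailWriter.setUint16(10, 3, true);         // total number of entries
tailWriter.setUint32(12, 150, true);       // central directory size in bytes
tailWriter.setUint32(16, 1000, true);      // central directory start offset

const directoryLocation = findCentralDirectory(archiveTail);
const directoryRange = rangeHeader(directoryLocation.offset, directoryLocation.size);
```

A second ranged request using `directoryRange` would return the listing; each listing entry then gives the offset of a file's local header, so only the chosen files need to be downloaded.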

wibagusto|4 years ago

Correct me if I’m wrong but the memory copying issue is not an issue if you pass an array buffer into WASM from JavaScript. In that scenario there’s no data copying. E.g. similar to how you’d pass the canvas data to WASM for direct manipulation.

azakai|4 years ago

No, this is an issue currently, both for network data and canvas data.

All wasm instructions can do is read and write from the wasm Memory that the wasm is initialized with. They can't even refer to separate things like a new ArrayBuffer from JS. So you do need to copy.
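
As a small illustration of that copy, here is a sketch using the standard `WebAssembly.Memory` JavaScript API. The offset `destPtr` is hypothetical (a real module's allocator would hand it out), and no wasm module is actually instantiated.

```typescript
// One 64 KiB page of linear memory; a real module would export or import this.
const wasmMemory = new WebAssembly.Memory({ initial: 1 });

// Bytes that arrived on the JS side, e.g. from a request body.
const incomingBytes = new Uint8Array([1, 2, 3, 4]);

// Hypothetical destination offset inside linear memory.
const destPtr = 1024;

// The view aliases wasm memory, but set() still copies the bytes across.
const wasmView = new Uint8Array(wasmMemory.buffer, destPtr, incomingBytes.length);
wasmView.set(incomingBytes);

// Changing the JS-side buffer afterwards does not change what wasm sees,
// which shows the two are separate copies rather than shared storage.
incomingBytes[0] = 99;
const firstByteInWasm = new Uint8Array(wasmMemory.buffer)[destPtr];
```

The typed-array view itself is free to create; the cost is the `set()` call, which moves every byte from the JS-owned buffer into the memory that wasm instructions can address.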

Newer wasm additions like reference types allow an ArrayBuffer to be referred to inside wasm, but only as an opaque reference to the entire thing (an externref). There is still no ability to actually read and write from it inside wasm.

The solution to this is BYOB ("bring your own buffer") APIs, which JS is adding. They are experimental atm though. Here is the relevant one:

https://developer.mozilla.org/en-US/docs/Web/API/ReadableStr...

Note how you pass in a view to the JS API. That can be a view into the wasm memory, letting the browser directly write data into there, and then wasm can operate on it immediately.

Matthias247|4 years ago

So how do you fill that arraybuffer from Javascript datatypes (e.g. strings which describe the request)? The answer is "by copying the relevant data" - which is exactly what the bridging code between JS and WASM does. You can only avoid that if the source data is already plain byte arrays and not javascript objects.
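
To make that concrete, here is a sketch of what bridging code might do for a single string, using the standard `TextEncoder.encodeInto` API. The offset `stringPtr` and the sample path are made up for illustration, and real glue code (for example, what a bindings generator emits) may differ in detail.

```typescript
const linearMemory = new WebAssembly.Memory({ initial: 1 });

// A JavaScript string describing the request; wasm cannot read it directly.
const requestPath = "/api/users?id=42";

// Hypothetical offset; in practice the module's allocator reserves this space.
const stringPtr = 2048;

// UTF-8 needs at most 3 bytes per UTF-16 code unit, so reserve the worst case.
const pathTarget = new Uint8Array(linearMemory.buffer, stringPtr, requestPath.length * 3);

// encodeInto transcodes UTF-16 to UTF-8 straight into linear memory.
// This avoids an intermediate buffer, but it is still a copy of the data.
const encodeResult = new TextEncoder().encodeInto(requestPath, pathTarget);
const bytesWritten = encodeResult.written;

// What the wasm side would see: (pointer, length) describing UTF-8 bytes.
const seenByWasm = new Uint8Array(linearMemory.buffer, stringPtr, bytesWritten);
const roundTripped = new TextDecoder().decode(seenByWasm);
```

The wasm function would then be called with `stringPtr` and `bytesWritten` as its arguments. Structured values such as header maps need the same treatment field by field, which is where the bridging overhead adds up.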
