
Run JavaScript on Cloudflare with Workers

249 points | jgrahamc | 8 years ago | blog.cloudflare.com

185 comments

[+] kentonv|8 years ago|reply
Hi folks! This is my project at Cloudflare -- which, entirely by coincidence, I joined exactly one year ago today: https://news.ycombinator.com/item?id=13860027

Happy to answer any questions.

[+] sitepodmatt|8 years ago|reply
This is awesome.

I guess the uses are endless. Debug some webhook endpoints by duplicating the request to a private requestbin - granted, this isn't the right level to do this, but sometimes we're working with legacy setups. Add missing JSON fields that an API change broke for some client - again, not the ideal level to handle it, but sometimes we band-aid it. Send an API auth spammer gigs of random garbage, because chances are their shitty client doesn't have any limits - granted, after 14.9 seconds you'd need to subrequest to your own garbage-generating origin, or to a 10GB garbage file somewhere.

My uses would mainly be live debugging where we don't have a perfect stack (i.e. nearly all the time).
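
The requestbin idea above can be sketched in the Service Worker style that Workers use. Everything here is hypothetical: DEBUG_SINK is a made-up placeholder URL, and the `addEventListener('fetch', ...)` registration (shown in a comment) is how a real Worker would wire it up.

```javascript
// Hypothetical sketch: mirror each incoming request to a private
// debug endpoint while the original request passes through untouched.
// DEBUG_SINK is a made-up placeholder, not a real service.
const DEBUG_SINK = 'https://requestbin.example/inspect';

function buildDebugCopy(request) {
  // Same method and headers, re-targeted at the debug sink.
  return new Request(DEBUG_SINK, {
    method: request.method,
    headers: request.headers,
  });
}

// In an actual Worker this would be registered roughly as:
//   addEventListener('fetch', event => {
//     event.waitUntil(fetch(buildDebugCopy(event.request))); // fire and forget
//     event.respondWith(fetch(event.request));               // normal pass-through
//   });
```

`waitUntil` keeps the worker alive until the mirrored subrequest settles, without delaying the real response.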

[+] netcraft|8 years ago|reply
The FAQ says that these are still in beta and not production ready - is that still true or has that page not been updated with this announcement?

Also, it says 50ms of CPU time and 15 seconds of real time as limits. When developing, are there easy ways to measure how long things are taking? I can imagine 50ms on my MBP may not be the same as in production - either faster or slower - but I wouldn't want to get to production to find out.

https://developers.cloudflare.com/workers/faq/

[+] geofft|8 years ago|reply
Hey, congratulations! Definitely going to play with this this weekend.

What are my options if I build something complicated using these workers and for whatever reason I need to stop using Cloudflare? Or if I want to write something open-source that people can deploy either on Cloudflare or on-premise? Is there a reasonable way to emulate this functionality (possibly with less sandboxing) on a self-hosted web server?

Also BTW you should update https://cloudflareworkers.com to note that it's live :)

[+] dpwm|8 years ago|reply
Some immediate uses that spring to mind involve storing data as strings. An example is rendering templates from JSON or a more resource-friendly serialization protocol. What are the limits on the file size of the workers?

Also, regarding resource usage, is this memory usage exposed at all to the worker via an API? I'm thinking there may be applications where it would be useful to cache fetched resources but without hitting the memory ceiling and having the requests die with the worker.

[+] sitepodmatt|8 years ago|reply
One more question: what's to stop this being used for amplification DDoS? Register with a stolen CC, then launch 100 parallel http(s) subrequests for each incoming request to your intended target, possibly adding a large random payload (well, as much as you can generate in 50ms, minus the time to set up 100 subrequests with the fetch API).
[+] sitepodmatt|8 years ago|reply
Any private GitHub/Bitbucket integration on the cards, like most CI SaaS have: get an OAuth2 token for the desired repositories, set up the notification webhooks, deploy on push (of a nominated branch)?
[+] sitepodmatt|8 years ago|reply
How does a cold start perform? I'm guessing that with sandboxing in V8 vs a whole new container, and deserialization of a single script into a free V8 sandbox slot (perhaps they store the AST or pre-JITed code for even faster starts? - not familiar with V8 internals), it is much faster than cold-starting Node.js on AWS Lambda?
[+] nathantotten|8 years ago|reply
So no free tier? That’s kind of a bummer, but I guess it’s time I start paying Cloudflare for something. :)
[+] thepumpkin1979|8 years ago|reply
Congrats, this is really great. Question: since it's not based on Node.js, there is no way to leverage a regular package manager, so how can I create reusable code between edge workers? How can I make an open source snippet without copy and paste?
[+] devwastaken|8 years ago|reply
Are the server edge-points true to UTC time? As in, are they synced with something like NTP? There is a severe lack of accurate time coming from edge-points in systems such as AWS Lambda, and no one appears to confirm whether or not they make sure their time is correct.

It would be incredibly useful if there were an edge-point service that could return linux epoch time in milliseconds (that's accurate to within 1ms of UTC). I've been working on a live broadcast syncing system, and there really wouldn't be anyone in a better position than CDNs with lambda-like functionality.

[+] btb|8 years ago|reply
Will every request to a worker-enabled site pass through the worker V8 engine and be charged at the going rate, even requests for static resources like favicons, jpgs, etc.? Or is there some way to limit the worker engine's request matching to a specific area of your site (like /service)?
[+] bad_user|8 years ago|reply
> Due to the immense amount of work that has gone into optimizing V8, it outperforms just about any popular server programming language with the possible exceptions of C/C++, Rust, and Go.

Odd statement and it’s not true.

I work with Node.js a lot, which uses V8, and I have tested a lot of code cross-compiled to both JavaScript and the JVM.

As a matter of fact the JVM beats the shit out of V8 in everything but startup time.

And this is not an educated guess. I have the same code, some cross-compiled via a compiler that can target both platforms (there are plenty of such compilers these days, including for Scala, Clojure and Kotlin) and some hand-optimized for each specific platform, and the difference is huge in both cases. And to be clear, this is not code that does numeric calculations, for which JS would be at a big disadvantage.

Consider that I can't even run the same unit tests: for JS I have to do far fewer iterations in property-based testing, because Travis CI chokes on the JS tests. So it's a constant pain that I feel weekly.

And actually anybody who has ever worked with both can attest to that. The performance of V8 is rather poor, except for startup time, where the situation is reversed: V8 is amongst the best and the JVM amongst the worst, if startup time matters.

[+] kentonv|8 years ago|reply
You're right, Java should have been included along with the other languages I mentioned. As a strongly-typed, compiled language, it should indeed beat V8 handily. I had intended to say that V8 outperforms other dynamically-typed languages like PHP, Python, Ruby, etc.

But because startup time and RAM usage are so important to our use case, Java has never been in the running as a plausible implementation choice, so to be honest I sort of forgot about it. :/

[+] benjaminjackman|8 years ago|reply
It's funny: I was writing ScalaJS for a while, and one idea I had was to use it for jobs where startup time really mattered, and use ScalaJVM for everything else, so I could kind of get the best of both worlds. It turned out we were able to live with the JVM startup time for our needs, so I never pursued it further.

I think the big things coming down the pipe for the performance of each are: value types for Java. A lot of what is needed can be done with sun.misc.Unsafe already, but it's an awkward way to have to program. I haven't kept up with it, but hopefully it supports being dropped right onto memory-mapped I/O for things like Cap'n Proto-style parsing.

On the V8 side, I think wasm could be a massive game changer and basically eat everything, in concert with JavaScript. I wonder how its performance is going to stack up against true native / a warmed-up JVM (one that has done all the fun inlining optimizations that can make it so fast).

[+] graystevens|8 years ago|reply
$0.50/million requests, but it's a minimum of $5/month (essentially giving you 10 million requests).

Not a criticism - it looks like an excellent product for those who can benefit from it - just calling it out for others who read the comments here before the original content.

[+] jakozaur|8 years ago|reply
Looks a little bit cheaper than AWS Lambda@Edge, which is $0.60/million, but more expensive than regular AWS Lambda at $0.20/million. On Lambda you pay extra for resources, but you can get more RAM or CPU there (e.g. running headless Chrome is an option).

On the other hand, CloudFlare Workers look more distributed, but are limited to 50ms CPU time, 15s wall time and 128MB. That is enough for redirects or A/B testing, but often not enough for writing serverless applications or any kind of rendering.

I wonder whether CloudFlare wants to get into the serverless business and this is the first iteration, or if it's just a CDN made more customisable by allowing code to run there.

[+] jgrahamc|8 years ago|reply
And they typically run in under 1ms, global deploys take less than 30s, and it's native V8.
[+] neuland|8 years ago|reply
Are Cloudflare Workers implemented as an NGINX module like OpenResty?
[+] stevebmark|8 years ago|reply
Wasn't running code on edge nodes how Cloudflare man in the middle attacked millions of encrypted websites for a year?
[+] siscia|8 years ago|reply
What I find missing here, and in AWS Lambda or Google Cloud Functions, is the notion of "state".

I believe it would really be a game changer if we could open and maintain a persistent connection to a database.

[+] ryanworl|8 years ago|reply
In Lambda and Cloud Functions you can open a connection with a database (or anything else) and it will stay alive across invocations of the same underlying container.

Not sure how Cloudflare Workers behave here, but their docs recommend global variables as a way to persist state, so perhaps an outbound TCP connection will stay alive across invocations of the same V8 "process".
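
The global-variable pattern from the docs can be sketched as follows; all names are hypothetical, and the cache lives only as long as the V8 isolate stays warm (and only on that one machine, not globally).

```javascript
// Hypothetical sketch: per-isolate state kept in a global variable.
// The Map survives across requests only while this isolate is warm.
const cache = new Map();

async function handleRequest(request) {
  const key = new URL(request.url).pathname;
  if (!cache.has(key)) {
    // Expensive work (in real code, likely a subrequest or template
    // render) runs once per isolate; later requests hit the cache.
    cache.set(key, `rendered:${key}`);
  }
  return new Response(cache.get(key));
}

// In an actual Worker this would be registered roughly as:
//   addEventListener('fetch', e => e.respondWith(handleRequest(e.request)));
```

The same reasoning is why a pooled database connection held in a global might survive between invocations, though nothing guarantees how long the isolate lives.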

[+] kentonv|8 years ago|reply
Indeed, we want to provide storage, but easy-to-use storage that can scale to 100+ (or 1000+) nodes is tricky. We're working on it. :)

PS. If you're a distributed storage expert and want to work on this, we're hiring!

[+] ggambetta|8 years ago|reply
FWIW, kentonv, who is posting in this thread and did this project, is the guy who invented protocol buffers :o
[+] ocdtrekkie|8 years ago|reply
Technically, Kenton made Protocol Buffers version 2, and open sourced it, I believe. Kenton did, however, make Cap'n Proto, which builds on what he learned from doing Protocol Buffers. And he also created Sandstorm.io, a self-hosting platform I am quite fond of. :)
[+] neals|8 years ago|reply
Not really into this space. What would be some use cases for this? Call an API, do a thing? Or more like process some data and post the results?
[+] pbowyer|8 years ago|reply
A great place to start is what companies like the Financial Times have been doing with Fastly. By pushing authentication to the edge [1] you can cache more. If you're receiving a lot of incoming data that you need to log and analyse later, you can use Workers to redirect it straight to your logging service - no servers involved [2].

1. https://www.fastly.com/blog/how-solve-anything-vcl-part-3-au...

2. Sorry, couldn't find the link. IIRC Fastly does this with its in-built logging feature.

[+] zackbloom|8 years ago|reply
There are some that people already think about doing on the edge, like complex caching rules, routing based on cookies, and edge side includes.

Then there are things people are just starting to think about, like doing a/b testing by serving different variants from the edge and building their API gateway into the edge.

Finally there are things people will only start dreaming of now that the tech is available, like filtering the massive stream of data coming in from IoT at the edge, or powering interactive experiences that require compute which individual machines don't have, but speed which centralization can't provide.
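
The A/B-testing case mentioned above can be sketched with cookie pinning; the cookie name, variant names, and origin hostname are assumptions for illustration, not anything Cloudflare defines.

```javascript
// Hypothetical sketch of cookie-pinned A/B testing at the edge.
function pickVariant(request) {
  const cookie = request.headers.get('Cookie') || '';
  const pinned = cookie.match(/ab-variant=(control|test)/);
  if (pinned) return pinned[1];                    // returning visitor stays pinned
  return Math.random() < 0.5 ? 'control' : 'test'; // new visitor: coin flip
}

async function handleRequest(request) {
  const variant = pickVariant(request);
  // A real Worker would fetch the matching origin here, e.g.
  // fetch('https://' + variant + '.origin.example' + new URL(request.url).pathname),
  // and set an ab-variant cookie on the response to pin the visitor.
  return new Response(`variant: ${variant}`);
}
```

Because the split happens before the cache and the origin, neither variant's users ever download the other variant's code.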

[+] emj|8 years ago|reply
Exciting times! Seems it's all about making a product that sells the features of serverless in the right way. Technically I would like websockets on these platforms, but I don't know how to sell that as a feature.

These numbers (50ms of compute time, 15s of real time) are interesting in the serverless space. Now I'm waiting for sane performance suites to figure out what suits you; I'm guessing this solution will kill it in those strange latency tests for AWS Lambda from the other day: https://news.ycombinator.com/item?id=16542286

[+] kentonv|8 years ago|reply
> Technically I would like websockets on these platforms

FWIW that's something I'm working on. It's tricky because it's not actually in the Service Workers standard. Currently, we support WebSocket pass-through (so if you do something like rewrite a request and return the response, and it happens to be a WebSocket handshake, it will "just work"), but haven't yet added support for terminating a WebSocket directly in a Worker (either as client or server).
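
The pass-through behaviour described here can be sketched as a plain handler (hypothetical code, not Cloudflare's): upgrade requests are forwarded untouched so the handshake "just works", while ordinary HTTP requests can still be rewritten.

```javascript
// Sketch of WebSocket pass-through: don't inspect upgrade requests,
// just forward them, so the 101 handshake flows through untouched.
async function handleRequest(request) {
  const upgrade = (request.headers.get('Upgrade') || '').toLowerCase();
  if (upgrade === 'websocket') {
    return fetch(request); // hand straight to the origin; frames pass through
  }
  return new Response('plain HTTP path, handled by the worker');
}

// In an actual Worker this would be registered roughly as:
//   addEventListener('fetch', e => e.respondWith(handleRequest(e.request)));
```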

[+] jkarneges|8 years ago|reply
Fanout (https://fanout.io) is useful for handling raw WebSockets from a FaaS backend.

Getting it to work with Cloudflare Workers is a little more involved since our Node libraries don't run in their Service Worker environment, but if you implement the negotiations manually it does work.

[+] HugoDaniel|8 years ago|reply
Do those workers have to complete a captcha ?
[+] jasongill|8 years ago|reply
I don't think you read the article - this is your own JavaScript that runs on their edge servers, not humans doing work.
[+] jeswin|8 years ago|reply
Will you be able to share how you sandbox these scripts?
[+] kentonv|8 years ago|reply
We have multiple layers of sandboxing. To start, each Worker runs in a separate V8 isolate (which is actually a stronger separation than Chrome uses to separate an iframe from a parent frame, by default). We also have an extremely tight seccomp filter, and a long list of other measures.

We made an intentional decision early on to avoid providing any precise timers in Workers -- even Date.now() only returns the time of last I/O (so it doesn't advance during a tight loop). This proved to be a really good idea when Spectre hit. (But we also shipped V8's Spectre mitigations pretty much immediately when they appeared in git -- well before they landed in Chrome.)
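
A toy illustration of that second point (this is not Cloudflare's implementation, just the idea): if the visible clock only advances when I/O completes, a tight loop cannot time itself, which starves Spectre-style gadgets of the precise timer they need.

```javascript
// Toy model, not Cloudflare's code: the clock visible to the script
// is frozen at the time of the last completed I/O event.
let lastIoTime = Date.now();

function coarseNow() {
  return lastIoTime;        // what Date.now() would return to the script
}

function onIoComplete() {
  lastIoTime = Date.now();  // the clock only advances here
}

// A busy loop cannot observe its own duration:
const before = coarseNow();
for (let i = 0; i < 1e6; i++) {} // spin for a while
const after = coarseNow();
// before === after: the loop appears to have taken no time at all.
```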

[+] tombowditch|8 years ago|reply
Looks very nice! Is the minimum cost ($5/m) per site or per account?
[+] ryanworl|8 years ago|reply
Can a Cloudflare app install a worker into a Cloudflare customer’s zone?
[+] kentonv|8 years ago|reply
Not yet, but with Workers launched this is now my top priority, and something we're all very excited about.
[+] zaarn|8 years ago|reply
Now this is indeed quite interesting.

Though $0.50 per million is only true if you have more than 10 million requests per month; otherwise the price will be higher, since there is a $5 minimum.

[+] hashseed|8 years ago|reply
In what cases should this be preferred over plain old Service Workers running in the user's browser? The latter is even lower latency, and free.
[+] jgrahamc|8 years ago|reply
1. It's easy to maintain/update the code because it is pushed once to Cloudflare and you don't have to worry about browser caching effects on JavaScript delivered to the browser.

2. The performance of the code will be much higher than in the browser because of the server resources available, and also because subrequests will happen across Cloudflare's fast/reliable links rather than whatever the end user is connected to.

3. The end user has control over what JavaScript is executed and might use a tool like Disconnect to block external scripts, preventing the code from running at all.

4. Security: you can include things like API keys.

5. Script starts executing earlier.

6. Conserves bandwidth/battery life of mobile users.

[+] kentonv|8 years ago|reply
A few cases off the top of my head:

- When you need to work with older browsers or non-browser clients (e.g. API clients!) that don't support service workers.

- When it would be a security problem if the user can bypass or interfere with the worker (e.g. you can implement authentication in a CF Worker).

- When you specifically want to optimize your use of the shared HTTP cache at the Cloudflare PoP.

- When startup time of a service worker would be problematic.

- When CORS would prevent you from making the requests you need from the browser side. (CORS doesn't apply to CF Workers, since the things CORS needs to protect against are inherent to the client-side environment.)
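
The authentication case above can be sketched like this; the token name and value are made up, and a real deployment would keep the secret out of the script source.

```javascript
// Hypothetical sketch of edge authentication. The token is a made-up
// example; real code would use a proper secret and constant-time compare.
const API_TOKEN = 'example-secret-token';

async function handleRequest(request) {
  const auth = request.headers.get('Authorization');
  if (auth !== `Bearer ${API_TOKEN}`) {
    // This runs on the edge, so the client cannot bypass or tamper with it.
    return new Response('Unauthorized', { status: 401 });
  }
  // A real Worker would now fetch(request) to the protected origin.
  return new Response('ok');
}
```

Unlike a browser-side service worker, a user who disables or modifies the script gains nothing: the check never leaves Cloudflare's machines.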

[+] mrkurt|8 years ago|reply
It's lower latency once you've delivered the javascript.

There's a lot of interesting stuff you can do with edge applications. Image optimization/resizing, content rewrites, pre-rendering, API gateway, etc, etc, etc. These are all things you want to do once for many visitors.

There's a lot you _can't_ do in a browser because browsers are untrusted. Edge applications can run with a different level of trust.

[+] holtalanm|8 years ago|reply
this looks like something that could be used to host an entire web app (sans database). Pretty cool!
[+] thefounder|8 years ago|reply
Is it possible to just change/rewrite the request origin like on AWS Edge?
[+] zackbloom|8 years ago|reply
Yes it is! You can make arbitrary requests in your Worker to anywhere on the internet you like, and return any response you like.
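
One way to sketch such an origin rewrite (the backend hostname is hypothetical): keep the path, query, method, and headers, but point the request at a different host.

```javascript
// Hypothetical sketch: re-target a request at a different backend
// while preserving its path, query string, method, and headers.
function rewriteOrigin(request, newHost) {
  const url = new URL(request.url);
  url.hostname = newHost;
  return new Request(url.toString(), {
    method: request.method,
    headers: request.headers,
  });
}

// In an actual Worker this would be registered roughly as:
//   addEventListener('fetch', e =>
//     e.respondWith(fetch(rewriteOrigin(e.request, 'backend.example'))));
```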