Eliminating JavaScript cold starts on AWS Lambda

200 points | styfle | 6 months ago | goose.icu

69 comments

samwillis|6 months ago

Oliver is doing awesome work here. A few interesting points:

- Porffor can use typescript types to significantly improve the compilation. It's in many ways more exciting as a TS compiler.

- There's no GC yet, and likely will be a while before it gets any. But you can get very far with no GC, particularly if you are doing something like serving web requests. You can fork a process per request and throw it away each time reclaiming all memory, or have a very simple arena allocator that works at the request level. It would be incredibly performant and not have the overhead of a full GC implementation.

- many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code. If you compile your trusted TS/JS to native you can do many new things, such as use traditional threads, fork, and have proper low level memory access. Separating the concept of TS/JS from the runtime is long overdue.

- using WASM as the IR (intermediate representation) is inspired. It is unlikely that many people would run something compiled with Porffor in a WASM runtime, but the portability it brings is very compelling.

This experiment from Oliver doesn't show that Porffor is ready for production, but it does validate that he is on the right track, and that the ideas he is exploring are correct. That's the important takeaway. Give it 12 months and exciting things will be happening.

spankalee|6 months ago

I'm very excited by Porffor too, but a lot of what you've said here isn't correct.

> - Porffor can use typescript types to significantly improve the compilation. It's in many ways more exciting as a TS compiler.

Porffor could use types, but TypeScript's type system is very unsound and doing so could lead to serious bugs and security vulnerabilities. I haven't kept track of what Oliver's doing here lately, but I think the best and still safe thing you could do is compile an optimistic, optimized version of functions (and maybe basic blocks) based on the declared argument types, but you'd still need a type guard to fall back to the general version when the types aren't as expected.
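A minimal sketch of that guard-plus-fallback idea, in plain JS standing in for what a compiler might emit (all names here are hypothetical, not Porffor's actual output):

```javascript
// Optimistic specialization with a type guard: the compiler emits a fast
// path for the declared types, guarded by a runtime check, and falls back
// to a fully general version when the declared types turn out to be wrong.

// Generic version: full JS semantics (string concat, coercion, etc.).
function addGeneric(a, b) {
  return a + b;
}

// Specialized version compiled under the assumption `a: number, b: number`.
// In real compiled output this could be a raw f64 add with no dispatch.
function addNumbers(a, b) {
  return a + b;
}

// Dispatch stub: check the type guard, then pick a path.
function add(a, b) {
  if (typeof a === "number" && typeof b === "number") {
    return addNumbers(a, b); // optimistic fast path
  }
  return addGeneric(a, b); // declared types lied; stay correct anyway
}
```

The guard is what keeps unsound TS types from becoming miscompilation: if a caller passes strings where `number` was declared, the program still behaves like ordinary JavaScript.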

This isn't far from what a multi-tier JIT does, and the JIT has a lot more flexibility to generate functions for the actual observed types, not just the declared types. This can be a big help when the declared types are interfaces, but in an execution you only see specific concrete types.

> or have a very simple arena allocator that works at the request level.

This isn't viable. JS semantics mean that the request handling path can generate objects that are held from outside the request's arena. You can't free them or you'd get use-after-free problems.
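A tiny illustration of that escape problem (an assumed scenario, not Porffor code): an object allocated while handling a request can end up referenced by state that outlives the request.

```javascript
// Module-level state: lives across requests, outside any per-request arena.
const seenUsers = new Map();

function handleRequest(req) {
  // This object would be allocated in the request's arena...
  const user = { id: req.userId, lastSeen: Date.now() };
  // ...but it escapes into long-lived state here.
  seenUsers.set(user.id, user);
  return `hello ${user.id}`;
}

handleRequest({ userId: "alice" });
// If the arena were freed at end-of-request, `seenUsers.get("alice")`
// would now be a dangling reference. In real JS it stays valid only
// because the GC can see the reference from `seenUsers`.
```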

> - many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code

This is true to some extent, but most of the restrictions are baked into the language design. JS is a single-threaded non-shared memory language by design. The lack of threads has nothing to do with security. Other sandboxed languages, famously Java, have threads. Apple experimented with multithreaded JS and it hasn't moved forward not because of security but because it breaks JS semantics. Fork is possible in JS already, because it's a VM concept, not a language concept. Low-level memory access would completely break the memory model of JS and open up even trusted code to serious bugs and security vulnerabilities.

> It is unlikely that many people would run something compiled with Porffor in a WASM runtime

Running JS in WASM is actually the thing I'm most excited about from Porffor. There are more and more WASM runtimes, and JS is handicapped there compared to Rust. Being able to intermix JS, Rust, and Go in a single portable, secure runtime is a killer feature.

gibolt|6 months ago

Based on how much imported libraries are relied upon, it makes sense to treat everything as untrusted. Unless you write every line yourself/in-house, code should be considered untrusted.

I would be curious which attack vectors change or become safe after compiling though.

bastawhiz|6 months ago

> many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code. If you compile your trusted TS/JS to native you can do many new things, such as use traditional threads, fork, and have proper low level memory access. Separating the concept of TS/JS from the runtime is long overdue.

This is just outright wrong. JS limitations come from lots of things:

1. The language has almost zero undefined behavior by design. Code will essentially never behave differently on different platforms.

2. JS has traditional threads in the form of web workers. This interface exists not for untrusted code but because of thread safety. That's a language design, like channels in Go, rather than a sandboxing consideration.

3. Pretty much every non-browser JS runtime has the ability to fork.

4. JS is fully garbage collected, of course you don't get your own memory management. You can use buffers to manage your own memory if you really want to. WASM lets you manage your own memory and it can run "untrusted" code in the browser with the WASM runtime; your example just doesn't hold water. There's no way you could fiddle with the stack or heap in JS without making it not JS.

5. The language comes with thirty years of baggage, and the language spec almost never breaks backwards compatibility.
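Point 4 above can be made concrete with a small sketch (plain Node-runnable JS, illustrative only): a toy bump allocator over an `ArrayBuffer`, which is about as close as JS gets to manual memory management without leaving the language.

```javascript
// Toy bump allocator over an ArrayBuffer. You hand out byte offsets as
// "addresses" and free everything at once by resetting the bump pointer.
class Arena {
  constructor(size) {
    this.view = new DataView(new ArrayBuffer(size));
    this.offset = 0;
  }
  // Allocate 8 bytes for a float64, store the value, return its offset.
  allocF64(value) {
    const addr = this.offset;
    this.view.setFloat64(addr, value);
    this.offset += 8;
    return addr;
  }
  readF64(addr) {
    return this.view.getFloat64(addr);
  }
  reset() {
    this.offset = 0; // "free" the whole arena in O(1)
  }
}

const arena = new Arena(1024);
const a = arena.allocF64(3.5);
const b = arena.allocF64(4.5);
const sum = arena.readF64(a) + arena.readF64(b);
```

Note this never touches the real JS heap or stack: ordinary objects remain garbage collected, which is exactly the point being made.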

Ironically Porffor has no IO at the moment, which is present in literally every JS runtime. It really has nothing to do with untrusted code like you're suggesting.

> You can fork a process per request and throw it away each time reclaiming all memory, or have a very simple arena allocator that works at the request level. It would be incredibly performant and not have the overhead of a full GC implementation.

You also must admit that this would make Porffor incompatible with existing runtimes. Code today can modify the global state, and that state can and does persist across requests. It's a common pattern to keep in-memory caches or to lazily initialize libraries. If every request is fully isolated in the future but not now, you can end up with performance cliffs or a system where a series of requests on Node return different results than a series of requests on Porffor.
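The persistent-state pattern being described looks like this in practice (an illustrative handler with hypothetical names, not any particular production code):

```javascript
// Module scope survives across warm Lambda invocations on Node, so
// expensive setup is done once and reused. A fork-per-request model
// would discard exactly this state on every invocation.
let dbConnection = null; // persists between invocations on a warm container
let invocationCount = 0;

async function handler(event) {
  if (dbConnection === null) {
    // Stand-in for a real connect(); runs only on a cold start.
    dbConnection = { connectedAt: Date.now() };
  }
  invocationCount += 1;
  return { count: invocationCount, connection: dbConnection };
}
```

On Node, two back-to-back warm invocations return counts 1 and 2 and share one connection; under full per-request isolation every invocation would see count 1 and reconnect, which is the performance cliff and behavioral divergence described above.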

As for arena allocation, this makes it even less compatible with Node (if not intractable). It means you can't write (in JS) any code that mutates memory that was initialized during startup. If you store a reference to an arena-allocated object in an object initialized during startup, then at the end of the request, when the arena is freed, you have a dangling pointer into freed memory.

How do you tell the developer what they can and cannot mutate? You can't, because any existing variable might be a reference to memory initialized during startup. Your function might receive an object as an argument that was initialized during startup or one that wasn't, and there's no way to know whether it's safe to mutate it.

Long story short, JS must have a garbage collector to free memory, or it's not JS.

> It is unlikely that many people would run something compiled with Porffor in a WASM runtime, but the portability it brings is very compelling.

Node (via SEA in v20), bun, and deno all have built in tooling for generating a self-contained binary. Granted, the runtime needs to work for your OS and CPU, but the exact same thing could be said about a WASM runtime.

And of course there are hundreds of mature bundlers that can compile JS into a single file that runs in various runtimes without ever thinking about platform. It's weird to even consider portability of JS as a benefit because JS is already almost maximally portable.

> This experiment from Oliver doesn't show that Porffor is ready for production, but it does validate that he is on the right track, and that the ideas he is exploring are correct.

It validates that the approach to building a compiler is correct, but it says little about whether the project will eventually be usable and good. It's unlikely it'll get faster, because robust JS compatibility will require more edge cases to be handled than it currently does, and as Porffor's own README says, it's still slower than most JITted runtimes. A stable release might not yield much.

unfunco|6 months ago

I really don't struggle that much with cold starts on Node.js/Lambda, and I don't do anything special, my build commands look like:

    esbuild src/handler.ts --bundle --external:@aws-lambda-powertools --external:@aws-sdk --minify --outfile=dist/handler.js --platform=node --sourcemap --target=es2022 --tree-shaking=true

Maybe I'm not doing as much as others in my functions and I tend to stick within the AWS ecosystem, so I save some space and, I presume, cold-start time by not including the AWS SDK/Powertools in the output, but my functions tend to cold start and complete in ~100ms.

anon7000|6 months ago

Sure, but the approach mentioned here benchmarks with median performance of 16ms. 100ms isn’t great especially if it’s only one part of everything that needs to happen

tkcranny|6 months ago

It’s a TS/JS to wasm to C tool chain, that runs the same JS a dozen times faster than on node. Very cool approach, and lambda cold starts are definitely where it ought to shine.

That said I wonder if it could ever go mainstream – JS is not a trivial language anymore. Matching all of its quirks to the point of being stable seems like a monstrous task. And then Node and all of its APIs are the gorilla in the room too. Even Deno had to acquiesce and replicate those with bugs and all, and it’s just based on V8 too.

tombh|6 months ago

Hopefully Oliver, the creator of the project, is here to better answer, but something worth bearing in mind is that a significant part of a conventional JS engine is its JIT machinery. Porffor is AOT, so it can lean on WASM and the C compiler for compilation optimisations. The author did a quick comparison of lines of code for some of the popular JS engines: https://goose.icu/js-engine-sizes

bapak|6 months ago

I seriously dislike these kinds of comparisons.

We're faster! (please disregard the fact that we're barely more than a demo)

Everyone knows about 80:20, the slowdowns will come after you start doing everything your competition does.

Look at Biome. We're 15x as fast as ESLint (but disregard the fact that we don't do type-aware linting). Then type-aware linting arrives and suddenly they have huge performance issues that kill the project for me (I'm unable to use Biome 2).

This happens over and over and over. The exceptions are very, very few (Bun is one example)

eliseumds|6 months ago

Idk, I'm having a good experience with Biome 2 in a large codebase. 4s to do a full check, including floating promises, undeclared and cyclic dependencies, and sorting imports. Our ESLint setup used to take almost a minute. The Biome team has been fixing bugs on a daily basis. Version 2.2.0 (released 3 days ago) addressed a common high-CPU-usage bug, try it out.

Edit: it's not 4s anymore, I just measured with the latest version and it takes ~900ms. Insane.

austin-cheney|6 months ago

I really find it annoying when JavaScript people complain about performance. If you think something is slow then make it faster and open a pull request.

It’s great when those PRs do come, but most of the time there is just empty whining while a developer contributes nothing. This is because most JavaScript developers are deathly afraid to write original software, as that would be reinventing a wheel.

Most JavaScript developers are absolutely incapable of measuring things, so they have no idea when something else is actually faster until someone else runs the numbers for them. Let's take your Bun example. Bun is great, but Bun is also written in Zig, which is faster than C++. Bun claims to be 3x faster at WebSockets than Node's popular WS package, because Bun can achieve a send rate of 700,000 messages per second (numbers from 5 years ago). Bun is good at measuring things. What they don't say is that WS is just slow. I wrote my own WebSocket library for Node in TypeScript about 5 years ago that can achieve a send rate of just under 500,000 messages per second. What they also don't tell you is that WebSockets are 11x faster to send than to receive due to frame header interpretation. I say this not to disparage Bun but to show your empty worship is misplaced if you aren't part of the solution.

ComputerGuru|6 months ago

Lots of negativity in this thread; let me offer a bit of positivity to contrast!

The project homepage is awesome, it's a mix between a throwback to retro documentation (with ascii charts) and a console out of godbolt: https://porffor.dev/

The hangup on the lack of GC is probably unnecessarily overwrought: WasmGC is pretty much here, and there will be an entire ecosystem of libraries providing JS GC semantics for WASM compilers that this compiler can tap into (actually implementing the backend/runtime GC support is fairly trivial for baseline support).

k9294|6 months ago

It looks similar to GraalVM for Java, right?

It would be amazing if they pull this off. Being able to compile JS to produce minimal binaries for CLIs or just to make slim containers would be nice.

torginus|6 months ago

I don't want to take away from the appreciation of this awesome technical achievement, but in practice I have noticed that:

- Cold starts are kinda rare, sure it sucks that your request takes 600ms, but that means you are the first user. If you would've been served by a container that was just scaled up, you'd have been waiting for much longer

- Microservices and AWS lambda are inherently stateless, and do a ton of things to make themselves useful - get credentials, establish db connections, query configuration endpoints, all of which take time, usually more than your runtime spends booting up.

As much as I like lambdas for their deployment and operational simplicity, if you want the best UX, they have inherent technical limitations which make them the wrong choice.

moqizhengz|6 months ago

Awesome work, but I am genuinely curious about the use cases where a 200ms init time is a problem.

ericmcer|6 months ago

Serverless implementations where lambdas are being used for APIs so +200ms on a single call is not good.

ttoinou|6 months ago

Maybe it's a problem when you're paying 200ms per small request and you have millions of them?

nemothekid|6 months ago

I have a lambda that runs inference on a model for a feature that is seldomly used.

Lambda is far more economical given the memory requirements, but we recently moved to Rust + Candle to shave ~300ms off cold starts, as the lag could be really jarring.

breve|6 months ago

It'd be good if AWS Lambda provided a wasm runtime option. Cold start times for WebAssembly can be sub-millisecond.

It'd also be interesting to see comparisons to the Java and .NET runtimes on AWS Lambda.

hylaride|6 months ago

Java startup times would almost certainly be worse, depending on what's going on.

A previous job I worked at ran Java on AWS Lambda. We ran our busiest Java lambda in a docker layer as our whole build system was designed around docker and from a compute performance point of view it was just as fast.

The main issues were:

* Longer init times for the JRE (including before the JIT kicks in). Our metrics had a noticeable performance hit that lined up to the startup of freshly initialized lambdas. It was still well within a tolerable range for us, though.

* Garbage collection almost never ran cleanly due to the code suspension between invocations, which meant we had to allocate more memory than we really needed.

The native AWS Lambda Java SnapStart feature would have helped, but the startup times were just not a big deal for our use case - we didn't bother with provisioning lambdas either. Despite the added memory costs, it was also still cheap enough that it was not really worth us investigating Java's parallel GC.

So as always, what language one should use depends on your use case. If you have an app that's sensitive to "random" delays, then you probably want something with less startup overhead.

syrusakbary|6 months ago

If you want to run in a WebAssembly runtime, perhaps Wasmer can be a good choice for you.

Note: we don't support .NET or Java atm, but we support PHP and Python is about to be fully supported!

https://wasmer.io/products/edge

zokier|6 months ago

Rust/C++ lambda cold starts are in the 15ms ballpark, it is very unlikely that you are going to get anything much faster than that. Spinning up firecracker vm just inherently takes some time no matter what you run inside it.

https://maxday.github.io/lambda-perf/

ahmedhawas123|6 months ago

I've approached this historically by committing persistent/reserved instances so you always have a few instances running. This is nice on paper but feels like you're omitting what a more production-appropriate solution is. "Cold starts" aren't just slow because of init, they're also slow because that's when lots of database connections, state starts, etc happen and managing init speed won't solve that.

spankalee|6 months ago

This is the first time that I've heard of LLRT, so here's a link for anyone else interested: https://github.com/awslabs/llrt

I still like Porffor's approach because by compiling to WASM we could have latency-optimized WASM runtimes (though I'm unsure what that might entail) that would benefit other languages as well.

halamadrid|6 months ago

Nice, we haven't faced this cold start problem. We like the idea of Lambdas being offered in a simple runtime platform where you can store and run the code as needed.

And you can chain it with other stuff as well, which is where workflow engines like n8n or Unmeshed.io work better. You can mix lambdas in different languages as well.

WillDaSilva|6 months ago

This is exciting to see. I run some latency-sensitive code on Lambda with the Node runtime, so cold starts are troublesome. I hope I'll be able to use this once it's in beta or fully released.

Comma2976|6 months ago

Title had me excited before I read past the first two words

voat|6 months ago

Pairing this with Fil-C would give it a garbage collector for free?

mangomountain|6 months ago

This is a cool attempt to make a lot of JavaScript run faster in Lambda. I personally got a significant decrease in cold start and runtime by switching from JS to Go and would recommend it as well.

Terretta|6 months ago

This is seriously snappy, and impressive work.

bastawhiz|6 months ago

Tl;dr

Use an experimental (as in, 60% of ECMA tests passing, "currently no good I/O or Node compat") AOT compiler for JS. You remove the cold start by removing the runtime, at the cost of maybe your JavaScript working and not having a garbage collector.

jfengel|6 months ago

It might be reasonable to go without a garbage collector if your whole lifetime is so short. GC by decapitation. Kinda clever.

But other than that it's impossible to assess performance with such a tiny toy.

thdhhghgbhy|6 months ago

No big corporate will ever use this; they'd be too worried about the compiler being compromised in some way. LLRT will never go primetime either, so we're stuck with the full Node runtime for a while.

LtWorf|6 months ago

Corporations use PyPI and npm, so I think they don't care about being compromised.