Announcing WCGI: WebAssembly and CGI

148 points| syrusakbary | 2 years ago |wasmer.io

115 comments

[+] joncfoo|2 years ago|reply
Next thing you know each wasm assembly will need a package format to ship assets with and have the app server provide common resources to all assemblies, e.g. db connection pools, some notion of security, etc.

Replace Wasmer with a JVM-based app server and WASM assemblies with JVM bytecode. The big difference is the source language doesn't matter as long as it's able to be run/replaced by WASM bytecode.

We're heading in circles in a lot of ways

[+] syrusakbary|2 years ago|reply
Indeed. The JVM did a lot of things right, however they missed three that are now solved with Wasm:

* Completely tied to an ecosystem, and incompatible with another (you could not run C programs in the JVM)

* Proprietary (vs based on an open standard)

* They couldn't run in the browser seamlessly

[+] pjmlp|2 years ago|reply
It is incredible how with so much Java hate, the WASM folks are doing their best to replicate everything we had in 2005.
[+] herdcall|2 years ago|reply
Multi language support is just one of the selling points, you didn't mention the other two: sandboxed security with explicit capabilities and high performance. E.g., I have a Web Assembly module running on the server (Fermyon) that needs explicit capabilities defined to even make a network call to a Twilio endpoint or read a local file. That means you can run a random Web Assembly module with confidence, just like you can typically open a random website on the browser without concern. By contrast, you can't say that when running a random Java class that you don't trust.
[+] gabereiser|2 years ago|reply
I was just going to say, this road leads to OSGi...
[+] tiffanyh|2 years ago|reply
Don't forget, a mail server will be included at some point since that always seems inevitable.
[+] 0xbadcafebee|2 years ago|reply
PHP was compiled into WASM, so now you can run PHP apps "as WASM". How is this different from just running PHP, without WASM? Apparently it's faster, but also they make this claim:

"Picture running Wordpress and not having to worry about attackers breaking into your system"

uh, so, you sprinkled some WASM magic on some code and suddenly several decades worth of security research is obsolete? .....yeah, I'm gonna call bullshit. Compiling code or "running in a sandbox" does not stop attackers from breaking into your system. Might slow them down for a few months while they develop some new attacks.

[+] PoignardAzur|2 years ago|reply
Compiling a service to wasm to protect it "from itself" (ie from untrusted data) has trade-offs.

On the one hand, you lose ASLR and other security features designed for native code. On the other hand, your program becomes immune to stack smashing, so arbitrary code execution becomes a lot harder for an attacker (at least that's my understanding).

[+] jeroenhd|2 years ago|reply
The WASM sandbox is a lot stricter than running native code, or many other VMs (Java/dotnet). The difference seems to be in the design process: most normal VMs are built around the idea of quickly developing and deploying an application that can do everything a native application can do, without having to deal with platform differences. WASM was built around optimizing specific parts of web pages and was explicitly designed NOT to have a huge surface area. By default, you set it up to be little more than "memory goes in, memory comes out" without having to worry about file system restrictions or sockets like you have to with alternative VMs.

I'm not a fan of the way these features are being tacked back onto the runtime, but the approach of having to opt into them makes the security boundary a lot more transparent. By disabling the file system (i.e. baking a fake one into memory) you can prevent whole classes of attacks. You won't get surprised by your logging library making network calls (log4j) if your sandbox never even enabled networking in the first place.

I'd much rather see easy and safe alternatives (like using containerization APIs) but there's still no easy way to tell your computer "run this program with a maximum amount of memory, CPU time, no network access and no file system outside this directory" that doesn't come with tons of caveats for preventing escape. For PHP, setting up a secure systemd configuration (more secure than the one that comes with your package manager) can be done, but it's still far from easy.

[+] anuraaga|2 years ago|reply
> Compiling code or "running in a sandbox" does not stop attackers from breaking into your system.

Correct - PHP is a scripting language with managed memory, quite different from C. Probably not all, but most WordPress vulnerabilities have to do with issues like sql injection, bad configuration defaults, etc, all in business logic. All of these will continue to exist when compiled to Wasm.

Assuming you actually have access to a database in the first place, while wasmer may have some vendor locked in solutions for it, generally with Wasm you won't get beyond SQLite for the near term.

[+] kristjansson|2 years ago|reply
edit: this was completely wrong, TFA is talking about server side WASM

I think the point is that a simple-enough application can be delivered as WASM and run entirely in the user's browser, so there wouldn't be any server-side system to break into? So one could ship e.g. wordpress + db + content in one bundle, and the user would be none the wiser. A wild claim, and probably self-defeating for anyone who needs to protect their content.

Otherwise, the WASM-dust at best moves the security boundary to a different service.

[+] chrismorgan|2 years ago|reply
> Consider the challenge of running PHP programs on servers. We have two primary options:

> 1. Wrap the PHP interpreter with a layer that instruments each HTTP call

> 2. Use the existing php-cgi program and simply compile it to Wasm

> Option 2 is not only faster, but it also enables any web application on Wasmer more efficiently.

I’m confused. This seems to be suggesting that php-cgi, which has to initialise the PHP environment every time, would be faster than the likes of php-fpm, which, well, I understand and presume it has significantly less overhead per request, though I’ve never benchmarked it.

I have PHP 5.6 installed on my VPS for one old site, and it takes around 27ms to start¹ (compared to under 30μs for just plain `echo`, as a closer indicator of actual process spawn overhead). PHP 8.2 might be faster, but it’s still going to be much slower than `echo`.

By simply compiling php-cgi to WASM, it will surely be doing all that initialisation for every request. Because CGI starts everything from scratch for each request, it’s inherently less efficient. In theory you could coordinate a time to snapshot the process/VM/whatever, forking from that point, but that’s not CGI any more.

All up, what they’re claiming is so completely contrary to what I would expect (and without any explanation or justification whatsoever), and kinda follows the “dust off something old to laugh at it again” trope, that I’m honestly having to check that it’s not the first of April any more (the article is dated the 6th).

So as I say, I’m confused. Option 2 seems very clearly slower and much less efficient, by the very nature of CGI. No one targets CGI (it’s been basically dead for… I dunno, close to twenty years?), because CGI is considerably worse than the alternatives. Can someone enlighten me? Have I missed or misunderstood something?

—⁂—

¹ Measured by running this in zsh and reading the “total” figure (across sixteen runs, I got between 2.671 and 3.032 seconds):

  time ( for i in {0..100}; do php56 <<<'<?="."?>'; done )
The comparative echo test uses `echo -n .` and takes one thousandth as long.
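
For anyone who wants to repeat this kind of measurement without zsh, here is a minimal Python sketch of the same idea: spawn a command repeatedly and average the wall-clock time. The function name `mean_spawn_seconds` is made up for illustration, and the demo times the Python interpreter itself rather than PHP, so the numbers are not comparable to the figures above.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(argv, runs=100):
    """Return the mean wall-clock seconds to spawn `argv` and wait for it to exit.

    The figure includes the overhead of `subprocess` itself, so treat it as an
    upper bound on the program's own startup cost.
    """
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,  # fail loudly if the command cannot run
        )
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Times a do-nothing Python process; swap in any other argv to compare.
    seconds = mean_spawn_seconds([sys.executable, "-c", "pass"], runs=20)
    print(f"mean startup: {seconds * 1000:.2f} ms")
```

Passing a different `argv` (an interpreter you actually want to compare) gives a like-for-like number on your own machine.
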
[+] jeremyjh|2 years ago|reply
To me this seems a little closer to the architecture of AWS Lambda than OG CGI, though that is not a perfect analogy either since this is in a WASM runtime within their server process, rather than a separate process. But the programming interface is a handler function you provide with an interface that looks like this in Rust:

`fn handler(request: Request) -> Response `

My understanding is the main function is called only once, and registers that handler. So `main` is where you'd initialize the majority of the environment, and no, that is not truly CGI; definitely no process is being created for each request, but it may be the case that this is more like FastCGI where you have a pool of single-threaded runtimes all set up that way that can handle requests.

This still seems inefficient compared to a threaded or event polling process that can handle multiple requests concurrently without having to marshall data back and forth, but I'd think it can get closer to that than FastCGI or Lambda do.

[+] paulgb|2 years ago|reply
I've always loved the simplicity and flexibility of CGI.

To check my understanding: since CGI just takes a raw request over stdin and returns a response over stdout, would a WCGI wasm module be compatible with WAGI[1] and vice-versa?

[1] https://github.com/deislabs/wagi
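
For reference, a minimal sketch of what a CGI-style program looks like in Python. Under CGI (RFC 3875) the request metadata arrives as environment variables such as `REQUEST_METHOD` and `QUERY_STRING`, the body arrives on stdin, and the response (headers, a blank line, then the body) goes to stdout. The `handle` function is just a name chosen for this sketch, and whether a given module runs unchanged under WCGI or WAGI is not something this example can confirm.

```python
#!/usr/bin/env python3
"""Minimal CGI-style program: request in via env + stdin, response out via stdout."""
import os
import sys


def handle(environ, body: bytes) -> str:
    """Build a complete CGI response: headers, a blank line, then the body."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    payload = f"method={method}\nquery={query}\nbody_bytes={len(body)}\n"
    headers = (
        "Status: 200 OK\n"
        "Content-Type: text/plain; charset=utf-8\n"
        f"Content-Length: {len(payload.encode('utf-8'))}\n"
    )
    return headers + "\n" + payload


def main() -> None:
    # CONTENT_LENGTH may be missing or empty for requests without a body.
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    body = sys.stdin.buffer.read(length) if length > 0 else b""
    sys.stdout.write(handle(os.environ, body))
    sys.stdout.flush()


if __name__ == "__main__":
    main()
```

Keeping the request parsing in `main` and the response building in `handle` makes the logic testable without a web server in front of it.
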

[+] cassepipe|2 years ago|reply
To me it has always felt like an underspecified hack, but maybe I am talking out of ignorance (I did read the RFC, though). I think it's a strange idea to get a bunch of arguments that you have to read from the environment and/or stdin, parse the whole of it, and then try to programmatically output it all by printing to stdout. No wonder people have come up with template languages that are HTML supersets and that work with a preprocessor.

I don't use CGI but when I do I like the simplicity of Haserl (Basic template language + any interpreter, lua by default) : https://haserl.sourceforge.net/

[+] syrusakbary|2 years ago|reply
That's right, both WCGI and WAGI are currently compatible!

Things might evolve a bit differently in the mid term, but let's see what the future holds :)

[+] benatkin|2 years ago|reply
Matt Butcher who used to work at Deis, which developed WAGI, and Microsoft after it was acquired, now has a new startup, Fermyon, which has something called Spin which I think uses a new protocol that's different from CGI. https://www.fermyon.com/blog/introducing-spin

That would make sense, anyhoo. CGI is plain text which I don't think is optimal for this stuff.

[+] evntdrvn|2 years ago|reply
The amount (and evolution) of acronyms in the WASM space is kinda overwhelming so I might be out to lunch…

At the top of the article it says “…compiling them to WASI”, but is that a semantically/technically correct statement? My understanding would be more that it should say something like “compiling them to WASI-compliant WASM” or something. Or can you actually “compile to WASI”

[+] capableweb|2 years ago|reply
WASI is just WASM outside the browser, it kind of implies what you're saying. It's still WASM, just adhering to a specific interface.

Like when you say "HTTP API" you don't necessarily need to change it to "TCP HTTP API" as it's somewhat implied (although maybe a shitty example, as HTTP is starting to appear over more things than just TCP as of late)

[+] smiletondi|2 years ago|reply
This is genuinely exciting - the prospect of running Wordpress without the usual security concerns is a game-changer. WCGI seems like it could really disrupt the server-side development landscape. Can't wait to see what other applications will benefit from this technology!
[+] smiletondi|2 years ago|reply
Full disclosure, I work at the company behind WCGI, but I truly believe this is a groundbreaking development that will have a significant impact on the industry.
[+] __MatrixMan__|2 years ago|reply
I don't understand. Why not just compile to machine code and use plain old CGI?
[+] rektide|2 years ago|reply
Others have some good reasons to also consider. Also, launching new sandboxes in wasm is supposed to be extremely extremely extremely cheap.

Whereas launching a cgi-bin executable, even a very small libcgi-based one, has a significant cost and requires a lot of kernel work & context switching.

With WCGI making new "processes" is nearly free & you don't have to context switch.

A lot of the excitement around wasm in general is that it could potentially enable a communicating-processes model of computing that would be inefficient today. Even current "function as a service" paradigms tend to retain processes, have warm/cold start distinctions. With wasm there is a potential to have requests spawn not just their sandbox, but to create whole graphs of lightweight sandbox/processes. Sometimes you might hear this described as a "Nano-functions" architecture.

[+] laurencerowe|2 years ago|reply
As well as sandboxing there’s the potential for better startup performance. Wasmtime have described how they can achieve microsecond startup times using virtual memory tricks to reset and reuse a module isolated between requests. https://bytecodealliance.org/articles/wasmtime-10-performanc...

This is faster than forking a process because there are fewer operating system resources to manage.

CGI starts a new process rather than forking an existing one which makes it unsuitable for use with languages such as Python or JS which have slow initialisation times (milliseconds.) Wizer is able to snapshot a WebAssembly module to avoid that work. So in combination with the fast startup that brings initialisation down to microseconds.

Now runtimes are still somewhat slower on WebAssembly than native, and much slower for JITed languages since the JIT cannot run in WebAssembly. But there are many cases where startup time dominates and this will be faster overall for cases where you need per request isolation.
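
The trade-off can be put into numbers with a small back-of-the-envelope sketch. It assumes a simple model in which a request costs startup time plus work time, and the Wasm build does the same work slower by a constant factor. The 27 ms startup comes from the PHP measurement elsewhere in this thread; the 30 µs Wasm startup and the 1.5x slowdown are made-up illustrative values, not benchmarks.

```python
def break_even_work_seconds(native_startup, wasm_startup, wasm_slowdown):
    """Per-request work time at which native and Wasm take equally long.

    Model (an assumption, not a measurement):
        native total = native_startup + t
        wasm total   = wasm_startup + wasm_slowdown * t
    Below the returned t, the Wasm path is faster overall.
    """
    if wasm_slowdown <= 1.0:
        raise ValueError("model only makes sense when Wasm runs slower (factor > 1)")
    return (native_startup - wasm_startup) / (wasm_slowdown - 1.0)


if __name__ == "__main__":
    # 27 ms native startup (from the thread); 30 us and 1.5x are hypothetical.
    t = break_even_work_seconds(0.027, 0.00003, 1.5)
    print(f"break-even work per request: {t * 1000:.2f} ms")
```

With those inputs the break-even is roughly 54 ms of native work per request: shorter requests come out ahead on Wasm under this model, longer ones on native.
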

[+] jtms|2 years ago|reply
I believe the main selling point is portability and flexibility. Anything written in a language that can be compiled to wasm can now be turned into a web service.
[+] smiletondi|2 years ago|reply
Platform independence: WebAssembly allows you to compile code once and run it on any platform supporting it, saving time and effort when deploying applications across various servers compared to dealing with platform-specific binaries.
[+] VikingCoder|2 years ago|reply
Sandboxing, I believe is the answer. Portability, too, I suppose. Maybe a long-lasting archive format for older binaries...
[+] pjmlp|2 years ago|reply
Hello Java Servlets.
[+] thomasjb|2 years ago|reply
This looks good, I'd been thinking about putting my little Python program that prints a random line from a textfile onto my Apache server for the internet to enjoy, this ought to enable it nicely. Where would be best to look for examples?

The idea of exposing Python via a normal cgi script is terrifying to me
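
As a starting point, here is a hedged sketch of such a program written as an ordinary CGI script. The file path comes from a hypothetical `LINES_FILE` environment variable set by the server configuration, never from the request, so a visitor cannot point it at another file. Running it under WCGI would additionally need a Python build for WASI, which this sketch does not cover.

```python
#!/usr/bin/env python3
"""CGI-style script that responds with one random line from a text file."""
import os
import random
import sys

PLAIN = "Content-Type: text/plain; charset=utf-8\n"


def pick_line(text: str, rng=random) -> str:
    """Return one random non-empty line from `text`, or '' if there is none."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return rng.choice(lines) if lines else ""


def build_response(path: str) -> str:
    """Read the file at `path` and build the full CGI response text."""
    try:
        with open(path, encoding="utf-8") as fh:
            line = pick_line(fh.read())
    except OSError:
        # Do not leak the path or the error details to the client.
        return "Status: 500 Internal Server Error\n" + PLAIN + "\ncould not read lines file\n"
    return "Status: 200 OK\n" + PLAIN + "\n" + line + "\n"


if __name__ == "__main__":
    # LINES_FILE is a hypothetical variable name; the default is also made up.
    sys.stdout.write(build_response(os.environ.get("LINES_FILE", "lines.txt")))
```

The script ignores the query string and request body entirely, which keeps the attack surface about as small as a CGI program can get.
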

[+] nuc1e0n|2 years ago|reply
FastCGI would be better than regular CGI. Spawning and cleaning up processes is expensive.
[+] benatkin|2 years ago|reply
A bit late, there is already Spin from Fermyon