
Bringing garbage collected programming languages efficiently to WebAssembly

426 points| kiyanwang | 2 years ago |v8.dev

257 comments

[+] samsquire|2 years ago|reply
This is exciting.

I think WASM is an example of a thin waist [1] with its garbage collector and N+M rather than N×M. (N languages + M virtual machines + G garbage collectors). That's a mature garbage collector in V8.

I was curious if there was a WASM to JVM compiler, and it seems there is one on GitHub [2]. I haven't used it; I was just curious because the JVM is a mature (and parallel) garbage collector.

Now I'm excited for WASM Threads for true parallelism and not just IO parallelism because I didn't think WASMGC would come out so soon.

The opportunity to solve async and parallelism and garbage collection EFFECTIVELY would strengthen WASM and not be a source of confusion or difficulty for developers. I think that's why WASI is so important, a chance to define an API as stable as POSIX.

1: https://www.oilshell.org/blog/2022/02/diagrams.html

2: https://github.com/cretz/asmble

[+] kodablah|2 years ago|reply
> I was curious if there was a WASM to JVM and it seems there is one on GitHub [...] https://github.com/cretz/asmble

While it works well, this was mostly a fun project for me and I no longer really maintain it. I hope that the ideas and explanations of how I mapped WASM IR to JVM bytecodes helps whoever does build this in a more official capacity. I don't have any plans to support WASM GC currently.

[+] dsign|2 years ago|reply
> I haven't used it I was just curious because the JVM is a mature (and parallel) garbage collector.

Slightly pedantic, but the JVM is a bit more than its garbage collectors[^1]. But it is true that the JVM has been the experimentation bed of garbage collection, not because academics have been especially daring with the JVM, but because JVM programs have been used extensively, and often in ways that strain any single GC technique.

[^1]: https://www.baeldung.com/jvm-garbage-collectors. The Azul JVM also has its own, different one. And those are the ones I know of; I'm pretty sure I'm leaving quite a few out.

[+] chriswarbo|2 years ago|reply
> Now I'm excited for WASM Threads for true parallelism

Pedantic: threads are concurrent, not necessarily parallel. It's weird to see them called "true parallelism". I assume you mean in comparison to coroutines, but those are sequential (their execution order may be arbitrary, but we can rely on them not being concurrent).

If wasm adopts threads that would be another unfortunate WorseIsBetter situation, given that threads are probably the worst model of concurrency we've ever devised (other than concurrent COMEFROM)!

[+] apatheticonion|2 years ago|reply
> Now I'm excited for WASM Threads for true parallelism

Same, I'm very interested in using threads in the browser but sadly thus far threading in wasm on the browser requires the server providing an extremely restrictive response header making threads impractical for most applications.
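
For context on the header requirement mentioned above: Wasm threads rely on shared memory (`SharedArrayBuffer`), which browsers expose only to cross-origin isolated pages. A minimal sketch of the two response headers involved, assuming a server that lets you attach headers to a response; `HeaderMap` and `withCrossOriginIsolation` are hypothetical names used only for illustration:

```typescript
// Hypothetical helper: adds the two headers that opt a page into
// cross-origin isolation. The header names and values are the standard
// ones; the function and type names are made up for this sketch.
type HeaderMap = Record<string, string>;

function withCrossOriginIsolation(base: HeaderMap = {}): HeaderMap {
  return {
    ...base,
    // Detach the page from cross-origin windows that opened it.
    "Cross-Origin-Opener-Policy": "same-origin",
    // Refuse cross-origin subresources that have not explicitly opted in.
    "Cross-Origin-Embedder-Policy": "require-corp",
  };
}

const pageHeaders = withCrossOriginIsolation({ "Content-Type": "text/html" });
console.log(pageHeaders);
```

`require-corp` is the restrictive part: every cross-origin subresource then has to opt in (via CORS or a `Cross-Origin-Resource-Policy` header), which is hard to guarantee on pages that embed third-party content.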

[+] ChrisRackauckas|2 years ago|reply
The Julia WASM tools can make apps which support (/require) this feature. For example, you can compile the ODE solver to WASM as is demonstrated on https://tshort.github.io/WebAssemblyCompiler.jl/stable/examp.... It requires Chrome v119 in order to work out of the box due to that being the first version to enable the GC support. The front page of the WASM compiler https://tshort.github.io/WebAssemblyCompiler.jl/stable/ has more details.
[+] andyferris|2 years ago|reply
Oh wow that’s great progress!

(Hopefully the new `Memory{T}` type makes it possible to compile even more code.)

[+] asim|2 years ago|reply
This is giving me "wasm is the new llvm" vibes. Anyone else getting that? I understand what it's attempting to do and have seen some demonstration of that power but for the most part it is still very technically low level and using it often ends up being very cumbersome. I'm curious to know when people think it's ready for broader adoption as something you'd choose as a target over the current way of programming.
[+] Alifatisk|2 years ago|reply
No matter how amazing this is and how many possibilities this opens, I can't stop thinking about the fact that browsers today are extremely complex and the curve to building your own is almost too step!
[+] kaba0|2 years ago|reply
> curve to building your own is almost too step!

It’s not too steep, it is literally impossible. A whole OS is a much easier deal than a browser.

[+] influx|2 years ago|reply
Even Google and Apple didn't start from scratch, they used Webkit aka KDE KHTML.
[+] jauntywundrkind|2 years ago|reply
Building your own Linux is impossible too, but no one bitches & moans about that. No one slams Android for saying they couldn't go make their own easily.

Why would the world's most successful & available hypermedia platform need to be something one can easily implement? Of course the most successful online system in the world is complex & featureful. That's why we use it & it has won.

[+] skybrian|2 years ago|reply
That’s true, but this work is a foundation for simpler alternative runtimes that don’t need an entire browser engine to run. (consider Node or Deno, which rely on V8.)
[+] chii|2 years ago|reply
> extremely complex and the curve to building your own is almost too ste[e]p

A lot of modern things have reached this point. You cannot possibly build your own car. You cannot possibly build your own house from scratch (too much of it requires specialist knowledge, like building codes etc).

[+] DylanSp|2 years ago|reply
I'm impressed that this finally got built and released; I've been hearing about WASM planning to include GC for several years now, I wasn't sure if it'd ever actually happen.

I'm curious how much this will help some of the WASM-targeting languages that have issues with large binaries from building their runtime. IIRC, Blazor requires something like 1 MB just for a hello world; I wonder if WasmGC will help with that.

[+] to11mtm|2 years ago|reply
It may take some time for WasmGC to be usable by .NET. Based on the discussions the first version of WasmGC does not have a good way to handle a few .NET specific scenarios, and said scenarios are "post-post-mvp". [0]

My concern, of course, is that there is not much incentive for those features to be added if .NET is the only platform that needs them... at that point having a form of 'include' (to where a specific GC version can just be cached and loaded by another WASM assembly) would be more useful, despite the pain it would create.

[0] - https://github.com/WebAssembly/gc/issues/77

[+] AlexErrant|2 years ago|reply
> Chrome's Wasm team has compiled versions of the Fannkuch benchmark (which allocates data structures as it works) from C, Rust, and Java. The C and Rust binaries could be anywhere from 6.1 K to 9.6 K depending on the various compiler flags, while the Java version is much smaller at only 2.3 K! C and Rust do not include a garbage collector, but they do still bundle malloc/free to manage memory, and the reason Java is smaller here is because it doesn't need to bundle any memory management code at all.

https://developer.chrome.com/blog/wasmgc/

For Blazor, it'll only help with the GC - IIRC Blazor has to ship the entire dotnet runtime.

[+] jillesvangurp|2 years ago|reply
The new wasm support in Kotlin is pretty exciting. There's an experimental version of compose multiplatform that supports targeting browsers that will be using WASM. Compose multiplatform is basically Google's Jetpack Compose for Android with added support for other platforms.

As of a few days ago, iOS support is in alpha and will hit beta next year. Android and desktop support are now stable. Once that stuff stabilizes, you can write UI applications that basically work on any platform.

The wasm compiler will roll out along with kotlin 2.0 which is the next major release for Kotlin and features a new compiler (k2). This looks like it should happen early next year. k2 is currently available in beta release and you can enable it in kotlin 1.9.x.

The nice thing with the Kotlin multiplatform ecosystem is that there are already a lot of libraries that work across the various platforms. So, the wasm compiler will rapidly become part of that and inherit a lot of nice libraries. Mostly all that takes is for builds to be reconfigured to target that ecosystem and implementing some platform specific behavior where that is missing.

Another interesting thing in this space is using and linking libraries written in different languages. For example, a lot of the platform specifics are probably going to depend on e.g. C or Rust libraries that are already available. In many cases these might be the same libraries that Kotlin native uses.

[+] craftamap|2 years ago|reply
Can somebody explain to me why this blog post (and the Chrome announcement blog post) does not mention Go? This gives me the impression that Go can't benefit from these changes, even though it's garbage collected.
[+] nindalf|2 years ago|reply
> Code in [C, C++ and Rust] that does any sort of interesting allocation will end up bundling malloc/free to manage linear memory, which requires several kilobytes of code. For example, dlmalloc requires 6K, and even a malloc that trades off speed for size, like emmalloc, takes over 1K. WasmGC, on the other hand, has the VM automatically manage memory for us so we need no memory management code at all—neither a GC nor malloc/free—in the Wasm. In the previously-mentioned article on WasmGC, the size of the fannkuch benchmark was measured and WasmGC was much smaller than C or Rust—2.3 K vs 6.1-9.6 K—for this exact reason.

Would it make sense for the runtime to expose an inbuilt alloc library that C/C++/Rust can use? Programs could opt in to that library instead of bundling their own allocation library.
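
To illustrate what "bundling malloc/free to manage linear memory" means in the quote above: linear memory is a flat byte array, so a module has to ship its own code for handing out offsets into it. Below is a rough sketch of the simplest possible allocator, a bump allocator that never frees, which is far simpler than dlmalloc or emmalloc. The `ArrayBuffer` only stands in for Wasm linear memory, and `BumpAllocator` is a hypothetical name:

```typescript
// Toy bump allocator over a flat byte buffer. Real allocators also
// support free() and reuse; this sketch only shows the basic idea.
class BumpAllocator {
  private readonly memory: ArrayBuffer;
  private next: number;

  constructor(memory: ArrayBuffer, start: number = 8) {
    this.memory = memory;
    this.next = start; // keep offset 0 unused so it can mean "null"
  }

  // Returns a byte offset into the buffer, or 0 if out of memory.
  // `align` must be a power of two.
  alloc(size: number, align: number = 8): number {
    const aligned = (this.next + align - 1) & ~(align - 1);
    if (aligned + size > this.memory.byteLength) return 0;
    this.next = aligned + size;
    return aligned;
  }
}

const memory = new ArrayBuffer(64 * 1024); // one Wasm page is 64 KiB
const heap = new BumpAllocator(memory);
const first = heap.alloc(12);  // offset 8
const second = heap.alloc(4);  // offset 24 (20 rounded up to a multiple of 8)
new DataView(memory).setInt32(first, 42, true); // "pointers" are just offsets
console.log(first, second);
```

With WasmGC none of this code ships in the module, because allocation and reclamation are handled by the VM.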

[+] ngrilly|2 years ago|reply
I'm a bit skeptical about this. This is significantly increasing WebAssembly's complexity.

Garbage collectors are a leaky abstraction: Some support interior pointers, others don't. Some support parallel tasks sharing memory, others don't. Some require compaction, making C FFI more difficult, others don't. Some require a deep integration with whatever mechanism is used for green processes/threads and growable stacks, others don't. Etc.

When looking at languages like Erlang, JavaScript, Python, or Go, the choices made at the language level are partly reflected in their garbage collectors.

That idea of a universal/generic VM supporting many languages has been tried many times, with limited success, for example with the JVM, CLR, or Parrot. What makes this different?

[+] jasode|2 years ago|reply
>That idea of a universal/generic VM supporting many languages has been tried many times, with limited success, for example with the JVM, CLR, or Parrot. What makes this different?

What's different this time is that WASM flips the timeline backwards from how Sun JVM and Microsoft CLR tried to do it.

- JVM/CLR : create virtual machine & IL instruction codes specification & runtime first, then later try to distribute widely as possible and hope for universal client adoption. In other words, the "intermediate language vm" will be so compelling that it causes industry to spread it throughout the landscape. This hope was only partially true. JVM/CLR did spread on desktop & servers but fell short on web browsers with Java Applets and Microsoft Silverlight. JVM/CLR also never got widely adopted on mobile platforms.

- WASM has opposite timeline : take something (aka Javascript) that's already distributed and adopted industry-wide and work backwards from that to create a "virtual machine & IL instruction codes specification & runtime".

In that view, Javascript (the so-called "toy language") was the 20-year long "Trojan Horse" of getting wide distribution on all clients first. Now the industry says: "Hey, did anyone notice that we finally have a universal runtime (Javascript) on server+desktop+browser+mobile?!? Let's make it fast by creating an IL runtime!!!"

There were a few technical issues such as Sun JVM not having raw pointers which makes it not a good performing target for pointer-based languages like C/C++. And MS CLR wasn't available on macOS (except for Silverlight minimal-CLR). But those technical limitations don't have as much explanatory power as the Javascript-as-harmless-trojan-horse-distribution timeline.

[+] titzer|2 years ago|reply
> That idea of a universal/generic VM supporting many languages has been tried many times, with limited success, for example with the JVM, CLR, or Parrot. What makes this different?

I would say that the JVM and CLR were not designed to support many languages. The JVM has one master, which is the Java language, and other languages that want to run on the JVM do so with a burden that increases as the distance from Java's type system increases.

Wasm is lower-level than most of the above bytecode formats, even with Wasm GC, which adds statically-typed structs and arrays, and the function-references proposal, which brings typed functions. With Wasm GC, there is also explicit support for tagged pointers (i31ref).

All of the above makes Wasm GC more general (by virtue of being lower-level) than the efforts you listed.
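
As a rough illustration of the tagged-pointer idea behind i31ref: a 31-bit integer can share a 32-bit word with references if one bit is reserved as a tag. The sketch below shows the generic technique only; which bit an engine actually uses, and how it represents i31ref internally, is an implementation detail not covered in this thread. All names are hypothetical:

```typescript
// Generic small-integer tagging: low bit 1 means "31-bit integer",
// low bit 0 would mean "aligned pointer". An assumption for illustration,
// not a description of any particular engine.
const I31_MIN = -(2 ** 30);
const I31_MAX = 2 ** 30 - 1;

function tagSmallInt(value: number): number {
  if (!Number.isInteger(value) || value < I31_MIN || value > I31_MAX) {
    throw new RangeError("value does not fit in 31 bits");
  }
  return (value << 1) | 1; // shift the payload up, set the tag bit
}

function isSmallInt(word: number): boolean {
  return (word & 1) === 1;
}

function untagSmallInt(word: number): number {
  return word >> 1; // arithmetic shift restores the sign
}

const tagged = tagSmallInt(-7);
console.log(isSmallInt(tagged), untagSmallInt(tagged));
```

The payoff is that small integers need no heap allocation, which matters for languages that box numbers by default.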

[+] zozbot234|2 years ago|reply
The WASM-GC support is intentionally kept as simple as possible, see https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP... . Building a "universal" or "generic" VM supporting many languages on an equally seamless basis is explicitly not a goal of WASM-GC; it is expected that each implementation might require its own hacks and special semantics on top of the basic support WASM provides, but this is okay because FFI/cross-language interop is seen as an entirely separate issue.
[+] xiphias2|2 years ago|reply
I think what makes it different is the importance of the project.

JavaScript garbage collection is here to stay, and it seems like it's the most sticky one of all the examples you mentioned.

Other languages and applications/libraries have to adapt to it even if it's inferior in many cases.

[+] adrusi|2 years ago|reply
> That idea of a universal/generic VM supporting many languages has been tried many times, with limited success

What wasm is doing is something different than previous efforts. The gc facilities aren't provided for the sake of interop with other languages, or for the sake of sharing development resources across language runtime implementations. Wasm is providing gc facilities so that managed-memory runtime languages can target wasm environments without suffering on account of limitations imposed by the restrictive memory model, and secondarily to reduce bundle sizes.

Wasm can potentially support more tunable gc parameters to better suit the guest language's idiosyncrasies than other general purpose language runtimes can. And unlike with the runtimes we're comparing it against, language implementers don't have the option of making something bespoke.

[+] klabb3|2 years ago|reply
> I'm a bit skeptical about this. [...] Some support [X], others don't. [...]

I am super conflicted, but I err on the side of agreeing with you here. Gut feeling says it's too complex and fragmented to be "standardized".

Generally speaking, "abstracting out" something should be done when that "thing" is actually pretty clearly defined and always ends up behaving the same way. My understanding is GC does not qualify - despite its popularity we still don't know where the cut points are, in all honesty. There's a reason why languages differ so much from one another, and afaik this is still an area of quite active research.

Adding to that, intricate understanding of GC internals of multiple languages is probably restricted to a very small number of engineers, worldwide. This makes it very difficult for curious bystanders to understand, follow and critique the progress. Remember that this is a massive career project for the proponents, so they're also incentivized to push ahead and perhaps gloss over legitimate issues.

And what if the design ends up being flawed, yet wildly popular? Then languages may be forced to conform to the WASM GC model in order to stay "modern" and attractive, while constraining their normal design space.

That said, if everything goes dandy it can be a huge win, so I can relate to the excitement as well.

[+] kaba0|2 years ago|reply
The correct question is “what makes WASM different”, not the other way around. And, mostly nothing, besides some “elegance” from being a new thing not yet full of legacy constructs. Oh, and that the browsers (or whatever is left from the browser scene with nigh everything being blink unfortunately) agreed on this one and not the others.

I personally think that GraalVM’s truffle infrastructure is much more impressive, and supporting the much wider managed language landscape may have been a better idea for wasm as well.

[+] quonn|2 years ago|reply
> the JVM, CLR, or Parrot. What makes this different?

I think primarily because the decision to run a certain language on WebAssembly comes first and now the question is how to make it fast. For the JVM or CLR there was usually no need: just pick Java or C#, or run the other language natively.

[+] The_Colonel|2 years ago|reply
The blog post goes into this topic. It's basically a question of trade-off between performance and exact semantics, in some cases you will need to sacrifice minor semantic differences for meaningful performance gains. I think this is rather common when porting languages to "foreign" VMs and doesn't seem to be a showstopper. The application developers targeting WASM and other platforms of course have to test on all platforms and will avoid those cases with semantic differences.
[+] naasking|2 years ago|reply
The main difference is that WASM started with unsafe languages that have all of those features, so the developers are aware of them, and now they're trying to add safety.

Prior universal VMs started with a safe language and bytecode and then added unsafety. Maybe that will change the outcome, maybe not. At the very least you can still use conservative GC.

[+] cranx|2 years ago|reply
I can see value in running code on multiple browsers on multiple OSes via web pages. I’d hate to npm install 2k+ JS libraries I don’t know or understand to be able to compile a simple Hello World Kotlin program that runs on a webpage. There does seem to be some notion in the industry that everything should be in a browser/JS, sigh.
[+] throwaway894345|2 years ago|reply
Yeah, I don't know how you could have one gc that works for everything unless that gc was very tunable/configurable and you could have different profiles for each language. Maybe the program payload includes the gc tuning parameters such that the gc performs like the GC for the program's host language?
[+] augusto-moura|2 years ago|reply
Isn't the JVM kind of a universal GC? Putting the different GCs in the JVM itself aside, a lot of other languages actually have runtimes for the JVM and work very well, used in production and all. Jython, JRuby, JS engines (Rhino/Nashorn), Groovy, etc.
[+] msla|2 years ago|reply
How universal did the JVM ever try to be? Aren't its semantics pretty tied to what's legal in Java?
[+] TeaVMFan|2 years ago|reply
A number of concerns with the viability of the current WASM GC are covered here (Google translation to English):

https://habr-com.translate.goog/ru/articles/757182/?_x_tr_sl...

and the original article:

https://habr.com/ru/articles/757182/

This is from the author of TeaVM, who has 10 years of experience getting Java and JVM code to run efficiently in the browser. https://teavm.org/

TeaVM's existing transpilation of Java to JavaScript performs well (using the browser's JS GC). It will be interesting to see if WASM GC matures to the point where it is even faster.

[+] azakai|2 years ago|reply
Interesting article, thanks!

Notes on the issues mentioned there:

* The need for a manual shadow stack: This is fixed in WasmGC (in the same way it works in JS, as the link mentions).

* Lack of try-catch: This is fixed by the Wasm exception handling proposal, which has already shipped in browsers, https://github.com/WebAssembly/exception-handling/blob/main/...

* Null checks: Mostly fixed by WasmGC. The spec defines non-nullable local types, and VMs can use the techniques the article mentions to optimize them using signals (Wizard does, for example).

* Class initialization: This is a difficult problem, as the article says. J2Wasm and Binaryen are working to optimize it through static analysis at the toolchain level. Here is a recent PR I wrote that makes progress there: https://github.com/WebAssembly/binaryen/pull/6061

* The vtable overhead issue the article mentions may be a problem. I'm not aware of good measurements on it, though. There are some ideas on post-MVP solutions for method dispatch that might help, but nothing concrete yet.

* Checks for null and trapping: There has been discussion of variants on the GC instructions that throw instead of trap. Measurements, however, have not shown it to be a big problem atm, so it is low priority.

The author is right that stack walking, signals, and memory control are important areas that could help here.

Overall with WasmGC and exceptions we are in a pretty good place for Java as emitted by J2Wasm today: it is usually faster than J2CL which compiles Java to JavaScript. But there is definitely room for improvement.

[+] torginus|2 years ago|reply
I feel like WebAssembly is becoming more and more like something like the .NET CLR - a virtual machine designed for running high-level languages, rather than a virtual CPU architecture.

Continuing with this analogy, once all the building blocks are in place (GC, threads etc.), it would be time to build a WebAssembly-native programming language.

[+] colordrops|2 years ago|reply
I was under the impression that direct interaction with the DOM from WASM was predicated on GC support. Does this mean that something like python could now directly access the DOM without javascript?
[+] TheBigSalad|2 years ago|reply
Man this is taking a while. I really didn't think I'd still be writing JavaScript in 2023.
[+] hexo|2 years ago|reply
I really do prefer open stuff so I have zero plans on turning wasm on in my browser. How does open web benefit from wasm, pretty please? Because open web is top priority, not enabling astonishingly complicated, compiled and locked down "apps". Did I miss something?
[+] apatheticonion|2 years ago|reply
I'm hopeful there is a future where the browser can use a wasm binary as an entry point for a website (rather than an html file). This implies access to the DOM and other web platform APIs.

Obviously this is terrible for document websites that need to be SEO optimised, however it would be great for dynamic web applications (think banking apps or Jira).

In those cases, the HTML file is just an unnecessary extra request and can actually contribute to slightly slower start up times.

Perhaps simply allowing html files to be sent in some kind of binary format where a wasm binary is inlined would be a more practical approach - as the html file does act as a kind of app manifest.

[+] bullen|2 years ago|reply
So now they are recreating the Java Applet?

I mean you could just re-enable the Java Applet and everything would be solved.

[+] zamadatix|2 years ago|reply
To the end developer it probably seems so but to the browser it's very different. Java Applets were things plumbed through the browser whereas WASM was designed as an expansion of the existing browser engines. What they enable one to do may seem very similar but WASM does it in a way that is significantly better (security, execution, maintenance, integration) for the browsing world.
[+] bob1029|2 years ago|reply
I am not a fan of web assembly as a general target for web apps. I think something like this makes sense for low-level performance-critical libraries, but not much beyond.

I've been using Blazor for some time now (server-side mode), and I feel like even this part without WASM is overkill complexity for 99% of the target audience. Adding the client-side WASM piece just kicks it into another dimension of "hard to reason with".

[+] politician|2 years ago|reply
I wish we could just exchange write-only or readonly byte buffers between WASM guests and their hosts using a sane stable API. Sadly, I no longer trust the standards folks to get something like structured data exchange (interfaces, components) to ever work correctly.
[+] Vt71fcAqt7|2 years ago|reply
Can someone explain if this could make a potential statically-typed javascript easier to standardize?

>WasmGC allows you to define struct and array types and perform operations such as create instances of them, read from and write to fields, cast between types, etc.

Would it make sense to define javascript types within this existing wasm standard? I imagine V8 is basically already doing this as wasmGC is just exposing the existing js GC, right? (Not sure about other browsers.) Even if we don't get static types in browsers it would be great if they could be defined so that an outside implementation could make use of it.

[+] aidenn0|2 years ago|reply
> Wasm is a low-level compiler target and so it is not surprising that the traditional porting approach can be used

Stack machines often cannot be targeted the same way as register machines; in addition, the fact that the call-stack doesn't share address space with the heap causes challenges.
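
For readers unfamiliar with the distinction, here is a toy sketch of how a stack machine evaluates `(a + b) * c`: operands are implicit on a value stack instead of being named registers, so a compiler back end built around register allocation does not map onto it directly. The instruction names mirror Wasm mnemonics, but the interpreter itself is a hypothetical illustration, not how any engine works:

```typescript
// Minimal stack-machine interpreter with four instructions.
type Instr =
  | { op: "local.get"; index: number }
  | { op: "i32.const"; value: number }
  | { op: "i32.add" }
  | { op: "i32.mul" };

function run(code: Instr[], locals: number[]): number {
  const stack: number[] = [];
  const pop = (): number => {
    const v = stack.pop();
    if (v === undefined) throw new Error("stack underflow");
    return v;
  };
  for (const instr of code) {
    switch (instr.op) {
      case "local.get": {
        const v = locals[instr.index];
        if (v === undefined) throw new Error("unknown local");
        stack.push(v);
        break;
      }
      case "i32.const":
        stack.push(instr.value | 0);
        break;
      case "i32.add": {
        const rhs = pop();
        const lhs = pop();
        stack.push((lhs + rhs) | 0); // wrap to 32 bits
        break;
      }
      case "i32.mul": {
        const rhs = pop();
        const lhs = pop();
        stack.push(Math.imul(lhs, rhs)); // 32-bit multiply
        break;
      }
    }
  }
  return pop();
}

// (a + b) * c with a, b, c in locals 0, 1, 2
const program: Instr[] = [
  { op: "local.get", index: 0 },
  { op: "local.get", index: 1 },
  { op: "i32.add" },
  { op: "local.get", index: 2 },
  { op: "i32.mul" },
];
console.log(run(program, [2, 3, 4]));
```

Note that the value stack here is separate from any addressable memory, which is the other half of the point above: code cannot take the address of a stack slot the way native C code can.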