top | item 38604628

(no title)

keithwinstein | 2 years ago

wasm2c (part of WABT) does this transpilation in a spec-conforming way; it passes all* the WebAssembly tests and enforces the memory-safety and determinism requirements and the rest of the spec. The memory bounds-checking itself doesn't have a runtime performance impact because it's all done with mprotect() and a segfault handler. (There are some other differences between w2c2 and wasm2c that also have to do with spec-conformance and safety; e.g., enforcing type-safety of indirect function calls. This costs <4 cycles but it's not zero.)

Re: bounds checks, the thing that consumes cycles isn't the bounds check itself, it's Wasm's requirement that OOB accesses produce a deterministic trap, even if the result of an OOB load is never observed and could be optimized out. wasm2c has to prevent the compiler from optimizing out an unobserved OOB load, and that forced liveness defeats some compiler optimizations (probably more than it needs to). But even with all that, we're talking like a <30% slowdown compared with native compilation across the SPECcpu benchmarks.

If you want to transpile arbitrary Wasm to native code in a spec-conforming way, you're probably better-off using wasm2c (which, disclosure, I work on). If you trust the Wasm module, or you're good with the isolation you get from your operating system and don't need Wasm's determinism, w2c2 seems great. Both of these are far less battle-hardened than V8 or wasmtime, especially when you include the fact that now you need an optimizing C compiler in the TCB.

---

* The Wasm testsuite repo has recently merged in the "v4" version of the exception-handling proposal, and WABT is still on "v3". But it does pass all the core tests (including tail calls) at least until GC is merged.

discuss

order

kaba0|2 years ago

So how is it any different than just adding bound checkings for normal C code?

keithwinstein|2 years ago

Well, a bunch of ways.

It's much faster to execute than adding a software bounds-check on every load. (Because the module declares its memories explicitly, it's very easy for a runtime to use a zero-cost strategy to enforce that memory loads/stores are all in-bounds.)

But Wasm's safety is more than bounds-checking memory loads/stores. E.g., Wasm indirect function calls are safe, including cross-module function calls for modules compiled separately, because there's a runtime type check (which wasm2c does very efficiently, but not zero-cost).

And, Wasm modules are provably isolated (their only access outside the module is via explicit imports). Whereas if you wanted that from "normal C code," it's a lot harder -- at some point you'll have to scan something (the source? the object file?) to enforce isolation and make sure it's not, e.g., jumping to an arbitrary address or making a random syscall. There's obviously a huge amount of good work on SFI but it's not easy to do either on "normal C code" or on arbitrary x86-64 machine code.

ncruces|2 years ago

30% is in the ballpark of what I'm expecting, actually.

SpaghettiCthulu|2 years ago

is it possible to `mprotect` less than a page? If not, how does this bounds checking work?