top | item 40408300

cfallin | 1 year ago

This is indeed a good point and something I want to write about when I eventually do a blog post on weval.

A few counterpoints that I'd offer (and what led me to still take this approach):

- If the target has sub-par debugging infrastructure, it can be easier to debug an interpreter (which is portable) and then apply the semantics-preserving PE separately. In particular, when targeting Wasm outside the browser, there is... not really a good debug experience, anywhere, for that. It was way easier to get an interpreter right by developing on native with gdb/rr/whatever, and then separately ensure weval preserves semantics (which I tested with lockstep differential execution).

- Maintenance: if one is going to have an interpreter and a compiler anyway (and one often wants or needs this, e.g. to handle eval()), it is easier for them both to come from the same source.

- Amortized cost: in the Wasm world we want AOT compilers for many languages eventually; there are interpreter ports with no Wasm backends; developing weval was a one-time cost and we can eventually apply it multiple times.

- If the semantics of the existing interpreter are quite nontrivial, that can push the balance the other way. I designed weval as part of my work on SpiderMonkey; extremely nontrivial interpreter with all sorts of edge cases; replicating that in a from-scratch compiler seemed a far harder path. (It's since been done by someone else and you can find the "wasm32 codegen" patchset in Bugzilla but there are other phasing issues with it from our use-case's PoV; it's not true AOT, it requires codegen at runtime.)
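The lockstep differential execution mentioned above can be sketched roughly like this: run the reference interpreter and the specialized implementation side by side and compare observable state after every step, so any semantic divergence is caught at the exact step it appears. This is a minimal toy illustration, not weval's actual test harness; all names here are hypothetical.

```python
# Lockstep differential execution (toy sketch): compare a reference
# interpreter against a "specialized" implementation step by step.

def interp_steps(program, x):
    """Reference interpreter for a toy bytecode; yields state after each op."""
    acc = x
    for op, arg in program:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        yield acc

def specialized_steps(x):
    """Hand-'specialized' version of PROGRAM below; yields state after each op."""
    acc = x
    acc += 2; yield acc
    acc *= 3; yield acc
    acc += 1; yield acc

PROGRAM = [("add", 2), ("mul", 3), ("add", 1)]

def lockstep_check(x):
    """Assert both executions agree at every step, not just at the end."""
    for i, (a, b) in enumerate(zip(interp_steps(PROGRAM, x), specialized_steps(x))):
        assert a == b, f"divergence at step {i}: {a} != {b}"
    return True
```

Comparing per step rather than only final results is what makes the technique useful for debugging: the first diverging step points directly at the miscompiled operation.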

I don't think the tradeoff is always clear, and if one is building a language from scratch and targeting a simple ISA, by all means write a direct compiler! But other interesting use-cases do exist.

JonChesterfield | 1 year ago

The risk of divergent semantics between the interpreter and the compiler is a really big deal. It's genuinely difficult to get a language implementation to behave exactly as specified, even when the spec is "do the same as some other implementation". Treating "compiled code" as specialising the interpreter with respect to the program is a great solution to that, since the bugs in the optimiser/partial-evaluator (they're kind of the same thing) are unlikely to be of the same class as bugs from independent implementations.
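The idea of compiled code as "the interpreter specialised to the program" (the first Futamura projection) can be shown with a toy partial evaluator that unrolls the interpreter's dispatch loop over a known program, leaving straight-line residual code over the unknown input. This is a minimal sketch with hypothetical names, not weval's API.

```python
# First Futamura projection (toy sketch): specialise an interpreter
# with respect to a known program, producing residual compiled code.

def interpret(program, x):
    """Reference interpreter: dynamic dispatch on every instruction."""
    acc = x
    for op, arg in program:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
    return acc

def specialize(program):
    """Partially evaluate `interpret` with `program` known: the loop and
    dispatch fold away, leaving straight-line code over the input x."""
    lines = ["def compiled(x):", "    acc = x"]
    for op, arg in program:
        lines.append(f"    acc {'+=' if op == 'add' else '*='} {arg}")
    lines.append("    return acc")
    ns = {}
    exec("\n".join(lines), ns)
    return ns["compiled"]

prog = [("add", 2), ("mul", 3)]
compiled = specialize(prog)
```

Because `compiled` is derived mechanically from the same interpreter source, any bug it exhibits is a specializer bug, a different class from the independent-reimplementation bugs the comment describes.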

Wasm is a really solid target for heroic compiler optimisations. It's relatively precisely specified, user facing semantic diagnostics are in some language front end out of sight, aliasing is limited and syscalls are finite with known semantics. Pretty much because it was designed by compiler people. You've picked a good target for this technique.

achierius | 1 year ago

One problem with Wasm is that in practice there's not as much optimization work to do as you might expect, since the higher-level compiler that produced it already did a lot of the work. Of course you still need to lower the stack machine + locals to actual spills/fills/etc., but still, a sizable chunk of the work is already done for you.
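The lowering step mentioned here can be illustrated with a toy translation from Wasm-like stack code to three-address form over virtual registers: the operand stack is tracked at compile time, so pushes and pops disappear and only register ops remain. This is a simplified sketch (infinite virtual registers, no validation); actual spill/fill decisions would come later, in register allocation.

```python
# Toy lowering of Wasm-like stack code to virtual-register form.
# The compile-time "stack" holds register names, not values.

def lower(ops):
    stack, code, fresh = [], [], iter(range(1_000_000))
    for op in ops:
        kind = op[0]
        if kind == "const":
            r = f"v{next(fresh)}"
            code.append(f"{r} = {op[1]}")
            stack.append(r)
        elif kind == "local.get":
            # Locals map directly to named slots; no code emitted.
            stack.append(f"local{op[1]}")
        elif kind in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            r = f"v{next(fresh)}"
            code.append(f"{r} = {a} {'+' if kind == 'add' else '*'} {b}")
            stack.append(r)
    return code

# (local.get 0) (i32.const 2) add (i32.const 3) mul
tac = lower([("local.get", 0), ("const", 2), ("add",), ("const", 3), ("mul",)])
```

Note how little "optimization" happens here: the work is mostly bookkeeping, which supports the point that the interesting optimizations were already done by the producer compiler.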