rolae|2 years ago
But to give some context: the author, Aaron Patterson, is a Ruby and Rails core team member. The article and headline are clearly targeting the Ruby community, where the article has been very well received. I think it's a good title for the intended audience.
The post clarifies in the first section:
> In this post I’d like to present one data point in favor of maintaining a pure Ruby codebase, and then discuss some challenges and downsides of writing native extensions. Finally we’ll look at YJIT optimizations and why they don’t work as well with native code in the mix.
edit: added original title of the hackernews post / article
wonnage|2 years ago
The JS ecosystem has the same problem, people think rewriting everything in Rust will be a magic fix. In practice, there's always the problem highlighted in the post (transitioning is expensive, causes optimization bailouts), as well as the cost of actually getting the results back into Node-land. This is why SWC abandoned the JS API for writing plugins - constantly bouncing back and forth while traversing AST nodes was even slower than Babel (e.g https://github.com/swc-project/swc/issues/1392#issuecomment-...)
mananaysiempre|2 years ago
Parsing has always been one of the things LuaJIT's tracing JIT struggled with; it is still faster than the (already fairly fast) interpreter, but in this kind of branch- and allocation-heavy code it gets nowhere near the famed within-1.25x-to-1.5x-of-GCC performance (or so) that you can get by carefully tailoring inner-loopy code.
(But a tracing JIT like LuaJIT is different from a BBV JIT like YJIT, even if I haven’t yet grokked the latter.)
LuaJIT’s FFI calls, on the other hand, are very very fast. They are still slower than not going through the boundary at all, naturally, but that’s about it. On the other hand, going through the Lua/C API inherited from the original, interpreted implementation—which sounds similar to what the Ruby blog post is comparing pure-Ruby code to—can be quite slow.
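For readers more familiar with Ruby than Lua: the closest stdlib analogue of this kind of FFI call is Fiddle. A minimal sketch (calling libc's `strlen`; assumes a Unix-like platform where `dlopen(nil)` exposes libc symbols):

```ruby
require "fiddle"

# Open the current process's symbol table, which includes libc.
libc = Fiddle.dlopen(nil)

# Bind strlen(const char *) -> size_t.
strlen = Fiddle::Function.new(
  libc["strlen"],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

# Each call marshals the Ruby String into a C pointer, crosses the
# boundary, and boxes the C return value back into a Ruby Integer.
strlen.call("hello")
```

That marshal/cross/box sequence is the per-call overhead being discussed; LuaJIT's FFI compiles most of it away, while API-style boundaries cannot.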
The SWC situation I can’t understand quickly, but apart from the WASM overhead it sounds to me like they have a syntax tree that the JS plugin side really wants to be GCed in the GC’s memory but the Rust-on-WASM host side really wants to be refcounted in WASM memory, and that is indeed not a good situation to be in. It took a decade or more for DOM manipulation in JS to not suck, and there the native-code side was operating with deep (and unsafe) hooks into the VM and GC infrastructure as opposed to the WASM straitjacket. Hopefully it’ll become easier when the WASM GC proposal finally materializes and people figure out how to make Rust target it.
In any case, it annoys me how hard it is in just about any low-level language to cheaply integrate with a GC. Getting a stack map out of a compiler in order to know where the references to GC-land are and when they are alive is like pulling teeth. I don’t think it should be that way.
vidarh|2 years ago
In a language like Ruby, parsing tends to be heavily dominated by scanning text and creating objects, and 1) you can often speed it up drastically by reducing object creation (e.g. here is Aaron writing about speeding up the GraphQL parser partly by doing that [1]), 2) creating Ruby objects and building up complex structures in a C extension is going to be almost exactly as slow as doing it in Ruby, 3) the scanning of the text mostly hits the regexp engine, which is already written in C.
(That said, I heavily favour not resorting to C extensions unless you really have to; even without going as far as some of Aaron's more esoteric tricks for that parser, you can often get a whole lot closer than you think, and the portion you need to rewrite, if you still have to, might well turn out to be much smaller than you'd expect.)
[1] https://tenderlovemaking.com/2023/09/02/fast-tokenizers-with...
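To make the allocation point concrete, here is a minimal sketch (not Aaron's actual code) using the stdlib StringScanner. The first tokenizer materializes a new String per token; the second records only byte offsets, deferring String creation until a token's text is actually needed:

```ruby
require "strscan"

# Naive tokenizer: allocates a fresh String for every token.
def tokenize_allocating(src)
  scanner = StringScanner.new(src)
  tokens = []
  until scanner.eos?
    if scanner.scan(/\d+/)
      tokens << [:INT, scanner.matched]    # new String per token
    elsif scanner.scan(/[a-z]+/)
      tokens << [:IDENT, scanner.matched]  # new String per token
    else
      scanner.getch                        # skip separators
    end
  end
  tokens
end

# Leaner variant: stores [type, start, length] and allocates no
# per-token Strings; the regexp matching still runs in C either way.
def tokenize_offsets(src)
  scanner = StringScanner.new(src)
  tokens = []
  until scanner.eos?
    start = scanner.pos
    if scanner.scan(/\d+/)
      tokens << [:INT, start, scanner.pos - start]
    elsif scanner.scan(/[a-z]+/)
      tokens << [:IDENT, start, scanner.pos - start]
    else
      scanner.getch
    end
  end
  tokens
end
```

The offset variant keeps GC pressure proportional to the token count of arrays rather than token count of arrays plus strings, which is the kind of saving that matters long before reaching for C.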
ksec|2 years ago
Chris Seaton has been stating this for over 5 years. It is unfortunate this mental model has never caught on in Rails.