hugomg | 1 year ago

> I can just as good implement it in C using the Lua C API

Pallene beats C when the code uses many Lua data structures. Accessing Lua data from C via the Lua-C API has significant overhead that can erase the gains from rewriting in C. Also, rewriting from Lua to Pallene is much less work than rewriting it in C.

Although Pallene is only a subset of Lua, the idea is that you use it together with Lua. It's not meant to replace Lua entirely.

Rochus|1 year ago

> Pallene beats C when the code uses many Lua data structures.

How can it beat C if it just transpiles to C? And accessing string-named fields in a table is still done via hashing, even in Pallene, isn't it?

> Also, rewriting from Lua to Pallene is much less work than rewriting it in C.

Staying in LuaJIT is even less work.

hugomg|1 year ago

The difference is the Lua-C API. The default Lua-C API is designed for humans: it is stable and safe to use, but every operation pays the cost of a function call and of passing data through the Lua stack. Pallene bypasses the API and reaches into the Lua tables directly. This is much faster, but would be impractical without the Pallene compiler: the internal struct layouts are unstable, and accessing them by hand is unsafe if you're not very careful.
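
To make the contrast concrete, here is a toy model in plain C. It is only a sketch: `ToyState`, `ToyArray`, `toy_geti` and `toy_popnumber` are hypothetical names invented for illustration, not the real Lua-C API or Lua's internal layout, and it shows the shape of the work per element rather than measuring anything.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical. They are NOT
   the real Lua-C API and NOT Lua's internal struct layout. */

typedef struct { double *items; size_t len; } ToyArray;

typedef struct {
    double stack[16];   /* value stack used to pass data across the "API" */
    size_t top;         /* number of values currently on the stack */
    ToyArray *arr;      /* the "table" being read */
} ToyState;

/* API-style read: bounds-check, then push the element onto the stack. */
void toy_geti(ToyState *S, size_t i) {
    assert(S->top < 16);
    S->stack[S->top++] = (i < S->arr->len) ? S->arr->items[i] : 0.0;
}

/* API-style conversion: pop the top of the stack into a C double. */
double toy_popnumber(ToyState *S) {
    assert(S->top > 0);
    return S->stack[--S->top];
}

/* Sum through the "API": two calls and a stack round trip per element. */
double sum_via_api(ToyState *S) {
    double total = 0.0;
    for (size_t i = 0; i < S->arr->len; i++) {
        toy_geti(S, i);
        total += toy_popnumber(S);
    }
    return total;
}

/* Sum by direct access: read the array's storage in place. */
double sum_direct(const ToyArray *a) {
    double total = 0.0;
    for (size_t i = 0; i < a->len; i++) {
        total += a->items[i];
    }
    return total;
}
```

Both functions return the same sum. The first models what hand-written C does through a stable API; the second models the kind of direct access that is only reasonable when a compiler generates it against a known layout.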

binary132|1 year ago

> How can it beat C

It doesn’t have the Lua to C interop overhead. You can obviously reduce that overhead by working on batches in C, but if you have a large and complicated dataset in Lua and need to iterate through it in C, the overhead accumulates with every access, so you don’t necessarily get “the performance of C” just by stepping into C.

If on the other hand you’re dropping into C to do something like decode a compressed stream, then the interop overhead is negligible compared to the work done in C. However, that interop overhead will be present wherever you put the boundary layer....
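
A rough way to picture the batching point, again as a toy in plain C: the `crossings` counter is a hypothetical stand-in for the fixed cost of one trip across the boundary, and `scale_one` and `scale_batch` are made-up names, not part of any real API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter standing in for the fixed cost of one
   boundary crossing between the scripting side and the C side. */
static size_t crossings = 0;

/* Per-element entry point: one crossing for every value processed. */
double scale_one(double x) {
    crossings++;
    return x * 2.0;
}

/* Batched entry point: one crossing for the whole array. */
void scale_batch(double *xs, size_t n) {
    assert(xs != NULL || n == 0);
    crossings++;
    for (size_t i = 0; i < n; i++) {
        xs[i] *= 2.0;
    }
}
```

Calling `scale_one` on n values pays the crossing cost n times, while `scale_batch` pays it once, which is why the overhead matters less the more work is done per crossing.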

kragen|1 year ago

> Staying in LuaJIT is even less work.

maybe! tracking down unexpected performance regressions is more work than correcting type errors reported by the compiler, and your luajit results suggest that typically a c subroutine (and perhaps consequently a pallene subroutine) will enjoy a 4× speed advantage over the luajit version, which might save you a lot of optimization work elsewhere