hugomg | 1 year ago

(I'm one of the Pallene authors)

For compiled code, Pallene's performance can be comparable to LuaJIT[1]. Pallene might be better in code with unpredictable branches, which don't fit in a single trace. LuaJIT has the edge in purely-interpreted code, and can inline indirect or cross-module function calls. LuaJIT is also more featureful, and has a better FFI story at the moment.

[1] https://www.inf.puc-rio.br/~hgualandi/papers/Gualandi-2020-S...

LuaJIT is really good software, and it is hard to beat it at its own game. Pallene tries to get its edge elsewhere: it tracks Lua 5.4, whereas LuaJIT diverged around 5.1 and will stay that way forever. Pallene's implementation is arguably more portable, because it compiles down to C and doesn't need hand-crafted assembly.

kragen|1 year ago

this is fantastic, thanks! it seems like you're saying pallene is using a c compiler as its compiler backend? that seems like it could give you a big leg up on difficult optimizations like inlining recursion (compared to writing your own backend, i mean, not specifically compared to luajit)

i'll read the paper. aha, it says:

> The Pallene compiler generates C source code and uses a conventional C compiler as a backend. This is simple to implement and is also useful for portability. The generated C code for a Pallene module is standard C code that uses the same headers and system calls that the reference Lua does. This means that Pallene code should be portable to any platforms where Lua itself can run.

btw, if you're not on a microcontroller, this can even be a feasible thing to do for a jit compiler; running gcc 9.4 on a small program takes about 70 milliseconds on this micropc, 55 milliseconds to compile a small shared library. clang is of course less practical, but older gcc versions were even better, and tcc is even better, taking respectively 10ms and 7ms. you might have to cut down your header files though
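
A rough way to check compile latencies like these on your own machine is sketched below. It is only an illustration, not the measurement method used above: the helper names `build_command` and `time_shared_library_build` are made up for this example, the flags `-shared -fPIC -O1` are an assumed typical invocation, and the tiny `add1` source stands in for "a small program". Results will vary with the compiler, the headers included, and the hardware.

```python
import os
import shutil
import subprocess
import tempfile
import time

# Assumed stand-in for "a small program"; it includes no headers at all.
C_SOURCE = "int add1(int n) { return n + 1; }\n"


def build_command(compiler, src, lib):
    """Return the argv used to build a shared library (assumed typical flags)."""
    return [compiler, "-shared", "-fPIC", "-O1", "-o", lib, src]


def time_shared_library_build(compiler="cc", source=C_SOURCE):
    """Return the seconds spent compiling `source`, or None if `compiler` is not on PATH."""
    if shutil.which(compiler) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "snippet.c")
        lib = os.path.join(tmp, "snippet.so")
        with open(src, "w") as f:
            f.write(source)
        start = time.perf_counter()
        # check=True raises CalledProcessError if the compiler reports an error.
        subprocess.run(build_command(compiler, src, lib), check=True)
        return time.perf_counter() - start
```

Calling `time_shared_library_build("gcc")` and `time_shared_library_build("tcc")` a few times each and comparing the results gives a feel for whether shelling out to a C compiler is fast enough for a given JIT-style use.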

the paper also has a longer section comparing pallene with luajit, which i will have to read in more detail. thank you very much for linking it!

5.4 vs. 5.1 could be an advantage for either side; minetest, for example, uses 5.1, so luajit is an option and pallene probably isn't

sitkack|1 year ago

Afaik there was a prototype Ruby JIT that used the C compiler this way and loaded the resulting code as a shared library. I did this with a Python decorator and ctypes so I could inline and hot reload my C extensions using

   @c_func
   def add1(n : int) -> int:
       """
       return n + 1
       """

I haven't touched this space in years; I think there are a couple of Python libraries that do this now. I did not generate C code at runtime tho.
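
For readers wondering how a decorator like that could work, here is a minimal sketch. It is not the original implementation, which isn't shown in the thread: `c_source_for`, the `C_TYPES` table, and this `c_func` are all hypothetical, only `int` is supported, and the docstring is assumed to hold complete C statements (including the trailing semicolon). Unlike the setup described above, this version invokes the C compiler when the decorator runs.

```python
import ctypes
import inspect
import os
import shutil
import subprocess
import tempfile

# Hypothetical mapping from Python annotations to (C type name, ctypes type).
C_TYPES = {int: ("int", ctypes.c_int)}


def c_source_for(func):
    """Build a C function definition from func's annotations and docstring body."""
    sig = inspect.signature(func)
    params = ", ".join(
        f"{C_TYPES[p.annotation][0]} {name}" for name, p in sig.parameters.items()
    )
    ret = C_TYPES[sig.return_annotation][0]
    body = inspect.cleandoc(func.__doc__)  # strips indentation and blank edges
    return f"{ret} {func.__name__}({params}) {{\n{body}\n}}\n"


def c_func(func, compiler="cc"):
    """Compile func's docstring as C and return the loaded native function."""
    if shutil.which(compiler) is None:
        raise RuntimeError(f"C compiler {compiler!r} not found on PATH")
    tmp = tempfile.mkdtemp()  # kept on disk so the loaded library stays valid
    src = os.path.join(tmp, func.__name__ + ".c")
    lib = os.path.join(tmp, func.__name__ + ".so")
    with open(src, "w") as f:
        f.write(c_source_for(func))
    subprocess.run([compiler, "-shared", "-fPIC", "-o", lib, src], check=True)
    native = getattr(ctypes.CDLL(lib), func.__name__)
    sig = inspect.signature(func)
    native.argtypes = [C_TYPES[p.annotation][1] for p in sig.parameters.values()]
    native.restype = C_TYPES[sig.return_annotation][1]
    return native
```

With this sketch, decorating `add1` as above (with `return n + 1;` in the docstring) would replace it with a ctypes function pointer. Hot reloading would need extra bookkeeping, such as rebuilding under a fresh library path whenever the source changes, which is left out here.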

fsfod|1 year ago

That binary search benchmark probably triggers a trace explosion in LuaJIT, as I've found quicksort does. If you're lucky the function gets trace blacklisted; if not, it ends up hitting the default max number of traces, throwing away all the JIT'ed code, and repeating the same thing over and over.

dasyatidprime|1 year ago

> LuaJIT is really good software, and it is hard to beat it at its own game. Pallene tries to get its edge elsewhere: it tracks Lua 5.4, whereas LuaJIT diverged around 5.1 and will stay that way forever.

Last I saw (admittedly some years ago), this also meant that an awful lot of the authors of Lua libraries/bindings diverged around then too and de facto intend to stay that way forever, giving up on anything that lacks LuaJIT's FFI (as opposed to mainline Lua's C bindings). A pile of Lua embedders then did the same, either because they had to keep tracking that crowd or because they really wanted the performance, and… so on. Unless this has changed in the meantime, I'm not sure what kind of edge you're hoping to get there.