Ruby+OMR JIT Compiler: What’s Next?

[+] rurban|9 years ago|reply

> In Evan’s keynote, he proposed a really interesting and ambitious solution to the problem he called “lifting the core.” It involved shipping Ruby with LLVM intermediate representation of the CRuby functions to allow the LLVM JIT technology to look inside the CRuby functions and dramatically increase the optimization horizon. As far as I know, this hasn’t been attempted yet — although, if it has, I really want to see it!

Actually unladen_swallow and cperl are doing this. Compile the whole runtime to a lib<rt>.bc (trivial make rule), load this bitcode file (a single call), and add the expensive optimizations, esp. the inliner and IPO in llvm. Just the type-checks need to be done in the jit. This inliner does a lot of good magic esp. for the small RT functions. >10x faster.

The compiler is expensive though (i.e. very slow), and LLVM changed it's jit API 3 times already. <=3.4 jit, then mcjit and >= 3.6 ocrjit, all 3 of them still having major quirks.

With mcjit you cannot selectively add jit code to a module. So every single body needs to be a new module. So module != package/class/namespace. So the jitcache is a bit complicated. With the latest ocrjit you got problems in finding the symbols in the bc. The C API doesn't support the name resolver at all, and it's a major quirks with C++. LLVM's C API is really behind, but at least they don't change it that often as the C++ API.

unladen_swallow has a huge JIT overhead library, which nobody really needs.

For the cperl jit I just used 4 days so far and doesn't even link yet. I used the C API, not the better C++ API. No C++ for cperl. But it's very simple. https://github.com/perl11/cperl/blob/feature/gh220-llvmjit/j...

[+] Twirrim|9 years ago|reply

On the GNU side there is https://www.gnu.org/software/libjit/ that could presumably be used, but that's GPLv3 and would have consequences

[+] chrisseaton|9 years ago|reply

Wow I had no idea anyone was shipping bitcode of their runtime for dynamic compilation. Are there any blog posts, papers, etc about it? Does it work well?

[+] magaudet|9 years ago|reply

Thanks so much for posting this! That was exactly what I was curious about!

[+] holydude|9 years ago|reply

Ruby is my favorit language and i am very excited about the performance improvements by projects like this.

[+] preordained|9 years ago|reply

Seconded and seconded. Nice to see Ruby continuing to grow/improve, and not always necessarily with a rails-centric motivation or focus.

[+] tiffanyh|9 years ago|reply

I really wish someone would step forward and do what Mike Pall did for Lua with LuaJIT. Phenomenal performance gains he achieved with Lua. Essentially runs equal to C performance.

And no, just because Ruby is highly dynamic - that's not an excuse for it being slow.

To quote Mike:

" tl;dr: The reason why X is slow, is because X's implementation is slow, unoptimized or untuned. Language design just influences how hard it is to make up for it. There are no excuses. "

[1] https://www.reddit.com/r/programming/comments/19gv4c/why_pyt...

[+] chrisseaton|9 years ago|reply

I believe that Lua is significantly simpler than Ruby. Not in the language, but in the size and scope of core library. To make a Ruby application fast you also have to make a large number of core library routines fast. A real Ruby program is not much more than a chain of core library calls.

I'm working on techniques to optimise through the core library in Ruby, and it's requiring novel research techniques that clearly weren't needed in LuaJIT, as it was done without them, so there must be something extra in Ruby.

[+] edelsohn|9 years ago|reply

Repurposing a statically-typed JIT is limiting. This is why PyPy, v8, LuaJIT and HHVM all have written custom JITs specifically for the languages.

https://pdfs.semanticscholar.org/d1fc/e50f5476088671adc3910d...

[+] amaranth|9 years ago|reply

I'm pretty sure that paper is talking about an earlier version of this same work targeting Python instead of Ruby.

[+] mstoodle|9 years ago|reply

Sorry David, but I need to speak against this 5 year old paper of yours.

I strongly disagree (and always have) with your statement that repurposing a statically typed JIT is limiting. The JIT projects you mention get lots of their performance from profile data which essentially presents type guesses so the compiler can speculatively optimize and recover when that info is wrong. Guess what? That's what statically typed Java JIT compilers do too, and the OMR compiler is no different, though we haven't sunk all that capability to OMR from Testarossa just yet (it's coming!).

I would be a lot happier with you repeating, or at least not propagating specifically the opposite conclusion, this line from your own paper:

On page 16, under Conclusion: "Our point, however, is not to argue against the repurposing of JIT compilers, but to define guiding principles and to promote techniques to construct an effective RJIT compiler."

Since this article was written, the entire Testarossa code base has been substantially refactored and parts of it rewritten (by the production Testarossa team) and streamlined to make it more amenable to dynamic optimization of other languages. We haven't finished that process (there's still more stuff we need to pull into the OMR compiler technology), but I really think your provocative statement above promotes a very broad conclusion from a research project that unfairly tries to close the door on what I still believe is a very reasonable technology evolution path. Especially since the article's conclusion is based on one research project for one language and one runtime where IMO the most reasonable conclusion is that just adding a JIT to an existing runtime is not enough to compete with a project that rewrote the entire runtime. That doesn't mean existing runtimes have nothing but bespoke paths forward or that reusable JIT technology cannot play a substantial part in that runtime's future evolution.

Finally, note that the OMR technology is being used (commits pulled in on an hourly basis) to build the production J9 JVM, which is also slated to be open sourced this year. There are also projects to use OMR for Ruby (see original post :) ), Lua, Rosie, Smalltalk (SOM++), Javascript, Python (currently inactive), and a runtime for workshops called Base9. I can provide links for the ones that have been open sourced already if people can't find them on github.

For those kind enough to have read to the end of this response (thank you!), please know that we have proposed several GSoC projects [1] to port to other languages and we're pretty open minded and positive about the potential and possibilities of reusing the Eclipse OMR compiler and other runtime components. Feel free to suggest alternative projects if you have a particular area of interest since we have lots of interested mentors available!

[1] https://wiki.eclipse.org/Google_Summer_of_Code_2017_Ideas#Ec...

[+] crudbug|9 years ago|reply

Eclipse OMR is interesting, it provides good platform abstraction with support for threading, monitoring, etc.

Can you create new native languages using Eclipse OMR ?

[+] magaudet|9 years ago|reply

When you say "native languages" you mean a statically compiled language?

It's been done before with the technology underlying the OMR compiler component, however there is some work that would need to be done to support this.

In principle though, there's no reason it couldn't be done.

[+] lobo_tuerto|9 years ago|reply

What does OMR stand for? Couldn't easily find a definition for it.

[+] jwmittag|9 years ago|reply

OMR is the Open Managed Runtime. Basically, IBM took their Java implementation (the J9 JDK), or more precisely, J9's Java Virtual Machine implementation, separated all the language-specific and VM-specific parts from the language-agnostic and VM-agnostic parts, broke it up into independently re-usable modules (memory manager, JIT compiler, garbage collector, profiler, debugger) and released it as Open Source Software under the umbrella of the Eclipse Foundation.

Ruby+OMR is the proof-of-concept project that aims to show that OMR is indeed language-agnostic and VM-agnostic and can be used as drop-in parts not only for newly-designed VMs but also easily integrated into existing VMs. The Java 8 version of IBM's J9 JDK is already built on top of OMR, but you could consider that cheating, since that's the very VM OMR came out of; Ruby+OMR shows that you can also do that with languages other than Java and VMs that have a different design than J9.

I believe there is also a Python+OMR project which does similar things with the CPython VM, but that project is not (yet) public, AFAIK.

[+] thatmiddleway|9 years ago|reply

Could this make eclim useful for ruby developers?

[+] autoreleasepool|9 years ago|reply

Eclim is already fairly useful for Ruby developers. I use it for the "stellar omnifuncs" it provides YouCompleteMe as described in the official docs [0].

[0] https://github.com/Valloric/YouCompleteMe#semantic-completio...

[+] magaudet|9 years ago|reply

Alas, our connection Eclipse is only a legal one. We are an Eclipse Foundation project, but not directly related to the IDE. So won't have any impact on eclim. Sorry!

[+] iagooar|9 years ago|reply

Ruby 3x3 is here already. It's called Elixir ;)

No, seriously, if your heart beats for Ruby, you will fall in love with Elixir. Give it a try and you won't look back.

[+] deedubaya|9 years ago|reply

I really like Elixir... but to say it's ruby 3x3 isn't quite fair. While the syntax is ruby-like, even the simplest ruby program isn't directly portable to Elixir. It's definitely something different, so saying things like this is misleading.

It'd be closer to say: > Ruby 3x3 is here already. It's called Crystal ;)

Definitely check out Elixir and Crystal though if you need more performance.

[+] VeejayRampay|9 years ago|reply

I understand that you're an Elixir enthusiast, you probably feel like people have not yet seen the light and they need to be educated about superior alternatives, but two things about that:

1) It borders on proselytizing and it's annoying to see people coming over to preach the good word on every other Ruby article. If your language of choice has merits, it doesn't need hype men, people will use it naturally.

2) Elixir is not Ruby, stop trying to sell it as such. They're two completely different languages. That one has heavily influenced the other's syntax choices is merely a sign that the language author liked what he found in Ruby before creating Elixir.

[+] dragonwriter|9 years ago|reply

> No, seriously, if your heart beats for Ruby, you will fall in love with Elixir.

I like Ruby, I like Elixir, but the latter for basically none of the same reasons as the former.

[+] craigmcnamara|9 years ago|reply

It's not Ruby if it doesn't run Rails.

37 comments