Jacobin: A more than minimal JVM written in Go

295 points | yla92 | 2 years ago | jacobin.org

178 comments

[+] ninkendo|2 years ago|reply
> An important factor in reducing the size of the codebase and executable is that Jacobin relies on Go’s built-in memory management to perform garbage collection, and so it contains no GC code.

This breaks my brain thinking about it. A lot of what the JVM does is interpreting/JITing bytecode and ensuring it links/executes correctly, and writing that logic itself in Go is one thing. But how does Go's GC help you garbage collect objects in the JVM you're implementing?

For example, you have objects in the JVM heap, tracked by the code you're writing in Go. You need to do a GC run. How does the Go GC know about the objects you're managing in the JVM? Do you just... write wrapper objects in Go around each of them, and code up a destructor so that freeing the Go object frees the tracked JVM object? How do you inform the Go VM about which pointers exist to your JVM objects?

I realize I'm way out of my depth here, and only have a "user"'s understanding of what GCs do, having never implemented one myself, but it seems crazy to me that Go's GC can Just Work with a VM you're writing in Go itself.

[+] felixge|2 years ago|reply
I suspect every JVM heap alloc is implemented by doing an alloc in Go. The JVM references to the object are pointers in the Go VM. So no special magic is needed. When the Go VM stops referencing an object, the Go GC will collect it.
[+] paulddraper|2 years ago|reply
> it seems crazy to me that Go's GC can Just Work with a VM you're writing in Go itself.

Far from it, it is more natural to do that than anything else.

Simplified example:

  // Reference slots are ordinary Go pointers, so the Go GC traces them directly.
  type Array struct {
    items []*any
  }

  type Object struct {
    fields map[string]*any
  }
These are the JVM values, and when the references to them disappear, the JVM values they reference can be GC'd as well.
[+] jsd1982|2 years ago|reply
Since the VM controls allocation of Java objects, just implement the VM to allocate the Java objects into Go's heap using Go's native allocator thereby allowing the native Go GC to clean those up when they become unreferenced.
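
To make the replies above concrete, here is a minimal sketch of the idea, not Jacobin's actual code: every name in it (`Object`, `Frame`, `newObject`) is hypothetical. It assumes a toy interpreter whose `new` instruction is just a Go allocation and whose operand stack stores Go values, so a JVM reference is nothing more than a Go pointer.

```go
package main

import "fmt"

// Object is a hypothetical JVM object. Its fields live in an ordinary Go map,
// so the Go runtime owns the memory.
type Object struct {
	className string
	fields    map[string]any // a field holds a primitive or a *Object reference
}

// Frame is a toy stack frame with an operand stack of Go values.
type Frame struct {
	stack []any
}

func (f *Frame) push(v any) { f.stack = append(f.stack, v) }

func (f *Frame) pop() any {
	last := len(f.stack) - 1
	v := f.stack[last]
	f.stack[last] = nil // clear the slot so the backing array drops its reference
	f.stack = f.stack[:last]
	return v
}

// newObject plays the role of the JVM `new` instruction: a plain Go allocation.
func newObject(className string) *Object {
	return &Object{className: className, fields: map[string]any{}}
}

func main() {
	f := &Frame{}
	node := newObject("Node")
	node.fields["next"] = newObject("Node") // JVM reference stored as a Go pointer
	f.push(node)
	node = nil // the operand stack now holds the only reference

	top := f.pop().(*Object)
	fmt.Println(top.className, len(f.stack))
	// After main stops using `top`, nothing in the Go program points at either
	// Node, so the Go collector may reclaim both. No JVM-side GC code is involved.
}
```

The interpreter never frees anything explicitly. Reachability from Go roots (frames, static fields, and so on) is the same thing as reachability in the hosted JVM, which is why no wrapper objects or destructors are needed.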
[+] bane|2 years ago|reply
Leveraging the millions of man hours that go into these runtimes' subsystems is starting to become a "thing", I've noticed -- especially when running code not meant for them. For example, there's a Nintendo Switch emulator that I believe just uses the C# runtime's JIT instead of trying to roll their own. Lo and behold, it works and they've saved themselves thousands upon thousands of hours writing and debugging their own.

It's kind of cool actually.

I wonder if there's a future where somebody can just pick and choose language and runtime components parts to create the environment they want before even writing a line of code. We sort of do it a level lower with VMs and containers, and then pick and choose language features we want to use (e.g. C++), but I don't know of a good way to use Java's JVM, C#'s JIT, somebody else's memory profiler, another team's virtual memory subsystem etc. without writing a bunch of different pieces in different languages to get those benefits.

[+] tombert|2 years ago|reply
I would be very curious to see side-by-side performance benchmarks between this, GraalVM, and vanilla JDK. My gut tells me (with no data to back this up) that the vanilla JVM will inch ahead once it's paid the cost of starting up but I would be interested to see how wrong I am.
[+] andrewbinstock|2 years ago|reply
Lead dev here. We've run a couple of benchmarks internally just for kicks. To create a fair comparison, you have to run the Hotspot JVM with the -Xint flag, which says interpret only. Right now our performance is anywhere from 15-25% of the speed of Hotspot with -Xint on small benchmarks. We figure that the use of Go alone creates some important portion of that overhead when compared with the C++ of the Hotspot JVM. We're guessing that a well-optimized Jacobin interpreter will eventually get to 50-60% of Hotspot's -Xint speed.

But we first want to get feature parity, before pivoting to performance. When we have feature parity, we'll run the Computer Language Benchmarks and post the results. That'll be fun to see!

[+] suresk|2 years ago|reply
I'm not the author, but I have contributed a few things to the project over the past year. The performance isn't anywhere close to the vanilla JVM - several times slower at best. It is entirely interpreted and there aren't any of the optimizations that have made their way into the JVM over the decades it has been around.

It has been a fun project to play around with for someone like me who thinks this kind of stuff is fun and interesting but will probably never get a chance to work on it full time. Cool to see it noticed, though!

[+] twic|2 years ago|reply
This JVM has no JIT, so OpenJDK and Graal will rather more than inch past it. It's not even a given that it will start up faster, since I don't believe it has been optimised nearly as hard.
[+] tgv|2 years ago|reply
I see it more as a niche opportunity to integrate bits of Java code in Go applications. The start-up costs can be "paid" when the app starts. Performance will be bad, but it might save some rewrites (provided they can get things like db connections going).
[+] jillesvangurp|2 years ago|reply
Clearly this is not going to be a high performance thing. I'd be surprised if it is even close.

It seems the goals for this are purely academic and about figuring out how Java works. They'll probably just do a simple interpreter and not a JIT compiler. That would be good enough for a POC. Additionally, they already indicated that they'll use Go's garbage collector, which won't be setting speed records with this either. And Java's typical usage of memory might actually stress it out a bit. Then there is the standard library which is going to need plenty of support for things like Threads, IO, various synchronization primitives and locks, etc. Doing that in Go is going to be a bit interesting but probably doable. Alternatively, they might just interface with native code directly and bypass the Go ecosystem. They might even reuse some things from openjdk for that. Speaking of which, native code and JNI would need to be implemented anyway.

[+] jsiepkes|2 years ago|reply
It looks like a cool research project. But realistically I would be surprised if it could beat OpenJDK in any benchmark (though I don't think that's the purpose of the project).

Writing something which is compatible with a specification is one thing, making it performant is another thing. For example it's not that hard to create a webserver from scratch but making it performant takes quite some effort.

[+] richieartoul|2 years ago|reply
This is so cool. Can’t wait to go digging around through the code this weekend. Good luck with the project!
[+] richieartoul|2 years ago|reply
I know the link stresses that correctness and code clarity are the primary goals of this project, but I’m curious if performance is a goal as well? Do you hope that people will run “real” workloads with this or use it to embed other software in their Go applications?
[+] gabereiser|2 years ago|reply
Ummm, excuse me, but where the f&$k has this been hiding? I’ve been looking for ways to extend my Go applications with scripting support. I started with Lua (worked), then Python (worked but hacky), then JavaScript using otto [1]. However, it lacks ES6 support, so having pretty OOP JS code is a non-starter. I would love to have Java as a runtime that can be executed from goroutines.

[1] https://github.com/robertkrimen/otto

[+] sam_on_the_web|2 years ago|reply
Have you had a look at Starlark? It's a Python-like scripting language built specifically for embedding into applications. Originally it was written in Java to be used in Bazel, but it's been re-implemented in Go and Rust and has found use in a bunch of other places.

https://github.com/google/starlark-go

[+] rainingmonkey|2 years ago|reply
What was your experience with Lua if it worked? What were the disadvantages that kept you looking for other extension languages?
[+] evacchi|2 years ago|reply
I am a fan of the Jacobin project! For your uses, you may also want to consider wazero [1], a pure-go WebAssembly runtime. Full disclosure: I am on the team :)

[1]: https://wazero.io/

[+] verdverm|2 years ago|reply
CUE is another interesting language to use from within Go, and is rather natural, given CUE is implemented in Go, but you can also do way more cool things with CUE via the Go API.

We're using CUE to validate and transform data, as input to code gen, the basis for a DAG task engine, and more

https://cuelang.org | https://pkg.go.dev/cuelang.org/go/cue | https://cuetorials.com/go-api (learn about CUE)

https://github.com/hofstadter-io/hof (where we are doing these things)

[+] hinkley|2 years ago|reply
What didn’t you like about Lua for your problem domain? It’s used quite a lot in computer games, and even nginx supports it.
[+] nazgulsenpai|2 years ago|reply
"I have spent the last eight months researching the JVM--reading the docs and articles and doing exploratory coding in various languages with which to write the Jacobin JVM."[0]

I wish I had the discipline to decide I wanted to create a long-term pet project, spend months researching what kind of project and which programming language to implement it in, and still be motivated enough to keep actively updating it 2 years later. Also, the code comments are educational, and the write-ups are inspiring. Great stuff.

[0] from http://binstock.blogspot.com/2021/08/a-whole-new-project-jvm... linked from article

[+] smokel|2 years ago|reply
Discipline gets a lot of attention around here.

My suggestion is to find something that piques your interest for a long enough time. Thinking around the "Flow" concept [1] suggests picking something that is just within reach of your capabilities.

And perhaps try meditation to exercise building up discipline.

It also helps to ask yourself several times per day: "What would be the smartest thing to do now?", until it becomes a habit. You will then accidentally think that phrase, which gives some freedom to override other (possibly) bad habits.

[1] https://en.m.wikipedia.org/wiki/Flow_(psychology)

[+] toasted-subs|2 years ago|reply
I'd love to make a highly integrated webserver. At this point it seems like spending tons of time getting caught up on what other people have done, why they have done it. Trying different techniques to see what actually seems to work. Where abstractions need to be and what provides the nicest transition between accessing different portions of stack. At this point I'm getting paid to work in tons of languages and integrating different stacks.

I wish somebody would unify these things but I'm running out of time in the day.

People that are able to commit time to projects like that are truly amazing.

[+] ShamelessC|2 years ago|reply
Reading the sibling responses so far, my only advice is to not take advice from random strangers on the internet about this sort of thing. You will just get bad takes tainted by survivorship bias.
[+] kaba0|2 years ago|reply
May not be the best advice, but a BSc/MSc thesis (or PhD for the even more masochistic) may give one just the right amount of pressure to be able to go through a bigger project. At least it did help me.
[+] abledon|2 years ago|reply
have you tried 'redbull' ?
[+] peter_d_sherman|2 years ago|reply
Very cool!

Hey, you know a feature which I would love to see (and maybe it's already in there) -- would be the ability to "orchestrate" Java code. In other words, to be able to add external event hooks from functions/procedures/methods -- at runtime...

These event hooks, when encountered, would be able to do such things as:

* Print/pipe a debug message to an external program or log or log viewer;

* The debug printer should contain the function/procedure/method name, parameter names, and parameter values automatically (there should be functionality to have them in both the invocation and after the function/procedure/method's code is complete, prior to returning to the caller, so two possible places per function/procedure/method...);

* The ability to selectively turn on/off such events at runtime;

* The ability to add additional code which could be evaluated for every such event (also at runtime), and make the determination if the event should be processed or skipped;

* The ability to do all of the above programmatically, via API...

* Some sort of GUI which automatically imports all function/procedure/method names of a running Java program at runtime, then gives the user the ability to track/log whichever ones they want, by simply selecting them to a secondary listbox of tracked function/procedure/methods...
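
The non-GUI items in this wish list can be sketched in a few dozen lines of Go. This is only an illustration under loud assumptions: `Event`, `Hook`, `Registry`, and `Invoke` are made-up names, not anything from the Jacobin codebase, and `Invoke` merely stands in for wherever an interpreter dispatches a method call.

```go
package main

import (
	"fmt"
	"sync"
)

// Event describes one method entry or exit. All names here are hypothetical.
type Event struct {
	Method string
	Args   []any
	Exit   bool // false on entry, true just before returning to the caller
}

// Hook pairs an optional filter with a handler and can be toggled at runtime.
type Hook struct {
	Enabled bool
	Filter  func(Event) bool // nil means "always fire"
	Handle  func(Event)
}

// Registry maps method names to hooks. An interpreter would consult it
// around each method invocation.
type Registry struct {
	mu    sync.RWMutex
	hooks map[string]Hook
}

func NewRegistry() *Registry {
	return &Registry{hooks: make(map[string]Hook)}
}

// Set installs or replaces the hook for a method.
func (r *Registry) Set(method string, h Hook) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.hooks[method] = h
}

// SetEnabled switches an existing hook on or off while the program runs.
func (r *Registry) SetEnabled(method string, on bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if h, ok := r.hooks[method]; ok {
		h.Enabled = on
		r.hooks[method] = h
	}
}

// fire copies the hook under the read lock, then runs it outside the lock.
func (r *Registry) fire(ev Event) {
	r.mu.RLock()
	h, ok := r.hooks[ev.Method]
	r.mu.RUnlock()
	if !ok || !h.Enabled || h.Handle == nil {
		return
	}
	if h.Filter != nil && !h.Filter(ev) {
		return
	}
	h.Handle(ev)
}

// Invoke stands in for the interpreter's method call path: it fires an entry
// event, runs the method body, then fires an exit event.
func (r *Registry) Invoke(method string, body func(args ...any) any, args ...any) any {
	r.fire(Event{Method: method, Args: args})
	ret := body(args...)
	r.fire(Event{Method: method, Args: args, Exit: true})
	return ret
}

func main() {
	reg := NewRegistry()
	reg.Set("Main.add", Hook{
		Enabled: true,
		Handle: func(ev Event) {
			fmt.Printf("method=%s exit=%v args=%v\n", ev.Method, ev.Exit, ev.Args)
		},
	})
	add := func(args ...any) any { return args[0].(int) + args[1].(int) }
	fmt.Println(reg.Invoke("Main.add", add, 2, 3)) // hook fires on entry and exit
	reg.SetEnabled("Main.add", false)
	fmt.Println(reg.Invoke("Main.add", add, 4, 5)) // hook is silent
}
```

Parameter names would have to come from the class file's debug information, which this sketch does not model.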

Now, maybe some or all of that -- is already baked in there. I didn't look at the codebase long enough to know...

But if it is in there, that's awesome! And if one or more of those features are missing, then maybe a future version maintainer or forked version maintainer might be persuaded to add them in there...

(Side note: It would be nice to have the above functionality for all programming languages!)

Anyway, as said previously, looks very cool!

[+] jsd1982|2 years ago|reply
Does "capable of running Java 17 classes" imply Java >= 17 or Java <= 17?

I tried to run a Java 11 jar on my M1 Mac with $JAVA_HOME pointed at a temurin-11.0.20 JVM but no such luck:

    $ ./jacobin -jar my.jar
    Class Format Error: Class  has two package names: apple/security and com/sun/crypto/provider
      detected by file: cpParser.go, line: 241
    ParseAndPostClass: error parsing classes/module-info.class. Exiting.
    Class Format Error: Invalid access flags of MethodParameters attribute #1 in main
      detected by file: methodParser.go, line: 333
    Class Format Error:
      detected by file: methodParser.go, line: 109
    ParseAndPostClass: error parsing my.Main. Exiting.
[+] meisel|2 years ago|reply
Just curious, what's the purpose of this? Are there any production use cases it's targeting, or is it just for fun?
[+] philipov|2 years ago|reply
I heard the project repository suffers from a detached head.
[+] lopatin|2 years ago|reply
I wonder if it uses goroutines as the virtual threads implementation.
[+] kernal|2 years ago|reply
>The goal is to provide a more-than-minimal implementation of the JVM that can run most class files and JARs

As with everything, the final 10% is 90% of the work.

[+] butterisgood|2 years ago|reply
I can't help but wonder if this means Java might work on Plan 9/9front with this.

But I'm also not sure that's something I'll ever need.

[+] mohitgangrade|2 years ago|reply
I don't know much about this project, or about JVMs in general. But could this run Minecraft? :)
[+] crickey|2 years ago|reply
So the garbage-collected language is written in a garbage-collected language? How does that affect the system?
[+] djha-skin|2 years ago|reply
The real question is, can it run Clojure JARs and does that improve Clojure start time?
[+] yaqubroli|2 years ago|reply
Initially I was very confused; why would a socialist magazine have an article about the JVM?