item 22715947

JITs are un-ergonomic

145 points | awinter-py | 6 years ago | abe-winter.github.io

255 comments

[+] lpghatguy|6 years ago|reply
An anecdote from a non-JS JIT, but similar: I once spent a summer working on a game engine with a couple others where the host language was LuaJIT.

It started out great. The iteration cycles were incredibly short, performance was competitive, the code was portable, and we could do wild metaprogramming nonsense that was still fast. If you haven't worked with LuaJIT, its C FFI is also incredible!

As we started scaling though, the wheels fell off the wagon. We'd add a new feature and suddenly the game wouldn't run at interactive framerates. One time, it was the 'unpack' function, which would trigger a JIT trace abort. We would drop from 12ms frames to 100ms frames. I wrote a length-specialized version that didn't abort and moved on.

Another time, it was calling Lua's 'pairs' function (the iterator over a table). Okay, so we can't do that, or a few other things that made Lua productive before.

The other problem we hit was that GC pauses were impossible to predict. We tried to mitigate that by using native data structures through the C FFI, taking control of the GC cycle to run it once or twice per frame, and so on. In the end, as with the JIT problem, we weren't writing Lua anymore; we were writing... something else. It wasn't maintainable.

That summer ruined dynamic languages for me. I didn't really want to be writing C or C++ at the time. I ended up picking up Rust, which was predictable and still felt high-level, and the Lua experience ended up getting me my current job.

[+] MrBuddyCasino|6 years ago|reply
Thanks for the story details, quite interesting. In the end this was unfortunately a case of picking the wrong tool for the job. Don't use JITed / GC'd languages when you've got hard realtime requirements.

Don't build a datastore on the JVM if you care about tail latencies; you'll be fighting the GC forever (see Cassandra). Don't rely on auto-vectorisation in your inner loop if possible: one tiny change could bring that house of cards crashing down.

I'd be interested in how your team ended up picking that tech stack. Was it a "rational" weighing of options with pros and cons? Was it "eh, it'll be alright"? Was it personal preference and/or prior experience?

[+] samatman|6 years ago|reply
'pairs' was jitted recently by the RaptorJIT crew, fwiw.

I've hit fewer snags with LuaJIT, but they're definitely there. Really wish Mike Pall had written that hyperblock scheduler before retiring...

[+] beetwenty|6 years ago|reply
I'm working with Lua right now (gopherlua) as a scripting option for real-time gaming. I've done similar things to your story in the past, trying to make Lua the host for everything, and I'm well aware of the downsides. But I have a requirement of maintaining readable, compatible source (as in PICO-8's model), and Lua is excellent at that, as are other dynamic languages, to the point where it's hard to consider anything else unless I build and maintain the entire implementation. So my mitigation strategy is to do everything possible to keep the Lua code in the glue-code space, which means that I have to add a lot of libraries.

I'm also planning to add support for tl, which should make things easier on the in-the-large engineering side of things - something dynamic languages are also pretty awful at.

[+] pron|6 years ago|reply
The OpenJDK JVM (aka Hotspot) addresses both issues: control [1] and monitoring [2] (there are built-in compilation and deoptimization events emitted to the event stream). You can also compile methods in advance [3], and inspect the generated machine code when benchmarking [4]. You can even compile an entire application ahead-of-time [5] to produce a native binary.

[1]: https://docs.oracle.com/en/java/javase/14/vm/compiler-contro...

[2]: https://docs.oracle.com/en/java/javase/14/jfapi/why-use-jfr-...

[3]: https://docs.oracle.com/en/java/javase/14/docs/specs/man/jao...

[4]: http://psy-lob-saw.blogspot.com/2015/07/jmh-perfasm.html

[5]: https://www.graalvm.org/docs/reference-manual/native-image/

[+] zamalek|6 years ago|reply
The article [incorrectly] equates all JITs with the author's experience with V8. It even states "faster than python, but slower than Java", which makes no sense because Java is itself a JITted language.
[+] amelius|6 years ago|reply
The GC is still nondeterministic though.
[+] banachtarski|6 years ago|reply
This article has a number of issues. First, JS with a JIT is waaay faster than Python, not "between python and java" as purported. Second, generalizing JITs as "un-ergonomic" seems silly given that what's specifically being looked at is benchmarking. But what makes this claim ridiculous is that nothing is easy to benchmark. Even native code is hard to profile, and this is literally my day job. If the JIT makes your code that much faster, this strikes me as a pretty suspect complaint.
[+] pizlonator|6 years ago|reply
I think that by “between python and java” they meant “faster than python and slower than java”. I think Java still beats JS unless you get lucky.

You're totally right that benchmarking and profiling are hard even for native code. I think this post fetishizes whether or not a piece of code got JITed a little too much. Maybe the author had a bad time with microbenchmarks. There's an anti-pattern in the JS world of extracting a small code sample into a loop and seeing how fast it goes, something C perf hackers usually know not to do. That tactic proves especially misleading with a JIT, since JITs salivate at the sight of loops.
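A toy sketch of the loop antipattern described above (the function, data, and numbers here are invented for illustration, not taken from the article): timing a snippet in a tight loop over identical input shows the JIT's best case, which may not transfer to production inputs.

```javascript
// Toy illustration of why "extract it into a loop and time it" misleads
// under a JIT. All names and values here are made up for the sketch.

function sumPoints(points) {
  let total = 0;
  for (const p of points) total += p.x + p.y;
  return total;
}

// Microbenchmark-style input: every object has the same shape, so the
// JIT specializes the loop and reports a best-case number.
const uniform = Array.from({ length: 100 }, (_, i) => ({ x: i, y: i }));

// Production-style input: mixed shapes ({x, y} vs {y, x, label}) make
// the same property loads polymorphic, so the microbenchmark's timing
// doesn't transfer even though the result is identical.
const mixed = Array.from({ length: 100 }, (_, i) =>
  i % 2 === 0 ? { x: i, y: i } : { y: i, x: i, label: 'p' + i }
);

console.log(sumPoints(uniform)); // 9900
console.log(sumPoints(mixed));   // 9900: same answer, different machine code
```

The point is that both calls compute the same thing, but only a run against realistic, mixed-shape data tells you what the JIT will actually do in production.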

[+] imtringued|6 years ago|reply
That was not a general statement about JavaScript performance. The entire article is about the unpredictability of the JIT. When the JIT hits a bad code path, it really does perform like Python; when it hits a good code path (most of the time), it performs like Java. This unpredictability is what causes the issues, not the general performance.
[+] j88439h84|6 years ago|reply
The PyPy JIT is way faster than CPython.
[+] fnord123|6 years ago|reply
>JS with JIT is waaay faster than Python. Not “between python and java” as purported.

You are correct that JS is not between Python and Java. Python is faster than Java, which is faster than JS. Some people seem to think calling APIs written in C is "not Python", but if the ecosystem provides the library and I call it from Python, then it's Python enough for me!

[+] connor4312|6 years ago|reply
As someone who's been writing a lot of JavaScript, Go, and a handful of other languages for a while, I feel this. In Go, I can basically know what's going to happen when I write a function. This operation will read from the stack, these instructions will be run, and I can take a peek at the assembly if I'm not sure (though I've developed a pretty good feel for what Go will do without needing that). I can benchmark it and know that the performance I see on my machine will be the performance when I ship this bit of functionality into production, barring hardware differences.

In JavaScript, it's a black box. I know some constructs might deoptimize functions when run on Wednesdays because I read them on a blog published in 2018 that's _probably_ still accurate. In my benchmark running on Node 12.14.1 on Windows this seems to be true. But then who knows if it'll be the same thing in production, and it might 'silently' change later on.

JavaScript in V8 is incredibly fast these days, but I find it much easier to write optimal code in Go.

[+] _bxg1|6 years ago|reply
> if your economics are such that servers are a bigger cost than payroll

Sorry, and I may be oversimplifying the author's situation, but this really sounds like a case where you need to not be using JS for your server. On the client you don't have much choice, but there the pure-JS performance rarely gets tight enough to warrant this degree of micro-optimization work.

The author makes some good points, and it would be great if the JIT were more profiler-friendly, but I have to question a little how important it actually is, the way the use cases line up.

[+] lispm|6 years ago|reply
> Interpreters, which read the program line by line as it runs

Byte code interpreters do that? That would be surprising. Programs are represented with 'lines' in byte code?

For the things he wants, let's look at SBCL, a Common Lisp implementation (see http://sbcl.org):

> compile short bits to native code quickly at runtime -> that's done by default

> identify whether a section is getting optimized -> we tell the compiler and the compiler gives us feedback on the optimizations performed or missed

> know anything about the native code that’s being run in a benchmark -> we can disassemble it, and some information can be queried

> statically require that a given section can & does get optimized -> done via declarations and compilation qualities

> compile likely sections in advance to skip warmup -> by default

> bonus: ship binaries with fully-compiled programs -> dump the data (which includes the compiled code) to an executable

[+] dahart|6 years ago|reply
A JIT is just another cache, like memory. Yes, it’s hard to predict, but not fundamentally that different from caching in any language. It does mean perf tests have to be end-to-end and match real-world loads, but it doesn’t mean it’s “impossible” at all, it means you need to measure.

Is this a real problem? I’ve been profiling my JS for years and never actually run into a mysterious problem where some important code I profiled was way way slower in prod than when I was profiling. Has that happened for you? How often does this happen? I take it as an assumption that profiling is something you mostly do on inner loops & hot paths in the first place. I mean, I profile everything to look for bottlenecks, but I don’t spend much time optimizing the cold paths.

> Get notified about deopts in hot paths

Studying the reasons for de-opts helps you know in advance when they might happen. If you avoid those things, de-opts won't happen, and you don't need notifications.

For example, ensure you don’t modify/add/delete keys in any objects, make sure your objects are all the same shape in your hot path, don’t change the type of any properties, and you’re like 90% there, right?
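A small sketch of the hygiene rules listed above (the `Point` example is invented, not from the comment): keep every object's properties present, in the same order, with stable types.

```javascript
// Deopt-prone style: shape depends on input, property order varies, and
// a property changes type (number -> string). Each variation creates a
// new hidden class, making hot code polymorphic. Illustrative only.
function makePointSloppy(i) {
  const p = {};
  p.x = i;
  if (i % 2 === 0) { p.y = i; }
  else { p.tag = 'odd'; p.y = i; } // different property order/shape
  p.x = String(p.x);               // type of x changes mid-life
  return p;
}

// Friendlier style: every instance gets the same properties, in the same
// order, with stable types, so hot loops over these stay monomorphic.
class Point {
  constructor(x, y, tag) {
    this.x = x;     // always a number
    this.y = y;     // always a number
    this.tag = tag; // always a string ('' rather than omitted)
  }
}

const pts = Array.from({ length: 10 }, (_, i) => new Point(i, i, ''));
const total = pts.reduce((acc, p) => acc + p.x + p.y, 0);
console.log(total); // 90
```

Both styles compute the same values; the difference only shows up in how the engine specializes the code that consumes them.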

> statically require that a given section can & does get optimized [...] compile likely sections in advance to skip warmup

While these don't exist in V8, it's maybe worth mentioning that the Google Closure Compiler helps a little bit: it ensures class properties are present and initialized, which can help avoid de-opts.

[+] inglor|6 years ago|reply
Hey, Node/bluebird person here: you want to run Node with --trace-opt, --trace-deopt, and --allow-natives-syntax (with %OptimizeFunctionOnNextCall) before benchmarking.
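A minimal sketch of the kind of file one might run under those flags (the function and iteration counts are invented for illustration):

```javascript
// Run with, e.g.:  node --trace-opt --trace-deopt bench.js
// With --allow-natives-syntax you could additionally force compilation
// via %OptimizeFunctionOnNextCall(hot); that intrinsic is a parse error
// without the flag, so it is left out of this plain-JS sketch.

function hot(n) {
  let acc = 0;
  for (let i = 0; i < n; i++) acc += i;
  return acc;
}

// Warm the function so the JIT considers it for optimization; with
// --trace-opt you should see trace lines mentioning `hot` as it gets
// marked for optimized recompilation.
for (let i = 0; i < 1e4; i++) hot(100);
console.log(hot(100)); // 4950
```

The file also runs fine without any flags; the flags only add the trace output to stderr.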
[+] cjfd|6 years ago|reply
The high-level point of great importance is that if you want a computer program to function in a reliable way, you need simple and understandable algorithms and components. Nowadays even the processor is no longer simple, assembly has become a high-level language, and that got us 'nice' things like Spectre.

For practical day-to-day work, the KISS principle that has been the cornerstone of effective programming for half a century is now more important than ever. Yes, there is a nice library available to do such and such, and maybe you need it, but are you also thinking about the danger that any additional moving part may increase unpredictability?

Let me give one very stupid example that I ran into this very week. The agreed-upon answer according to https://unix.stackexchange.com/questions/29608/why-is-it-bet... is that it is better to do #!/usr/bin/env bash than #!/usr/bin/bash. I say: absolutely not! You have just increased the number of moving parts involved by one, for an immeasurably small benefit. If you are worrying about bash versions, then just stop using any bash features that are less than a decade old.

I also say that unless it is necessary for the core problem you are trying to solve, stay away from any features of anything that are less than five years old. If you actually need the new and fancy stuff, and maybe you do, expect to pay a hefty price for it. Every new tool you introduce has its own peculiarities that you will spend hours of debug time on, and one should err on the side of just saying no.
[+] samatman|6 years ago|reply
LuaJIT, true to form, has a sort of solution for this, in the form of a profiler cheap enough to run in production.

You do have to change how you think about performance analysis, but in return, you get to actually answer the question you're trying to reason about, namely, how does this run in production.

Pacifying the JIT is a bit of a dark art, but the whole thing is pretty transparent with good tooling. I've yet to regret building on LuaJIT.

[+] thu2111|6 years ago|reply
Worth noting that the JVM has this feature as well, under the name of "flight recorder".
[+] BorisTheBrave|6 years ago|reply
This article contains a logical error. The premise is JITs are hard to benchmark and keep good performance on. That's true.

But the alternative is bad performance all the time (the JITs fall back to interpretation, after all).

What's the value in having clearly understood bad performance? If you care enough about performance that you need to understand it, surely you care about the absolute level of performance.

[+] pizlonator|6 years ago|reply
This post is extremely V8-centric. For example it uses terminology like “deopts” which means nothing in JavaScriptCore (we distinguish between exits, invalidations, jettisons, and recompiles). The post also assumes that there is only one JIT (JSC has multiple).

And that’s where you’ve lost me. Not sure how you expose anything about how the JIT is operating without introducing a compat shitshow since JIT means different things in different implementations.

If you really want to know that something gets compiled with the types you want, use a statically typed and ahead of time compiled language.

If you have to use a JIT but you find that it doesn’t do what you like then remember that it’s meant to be stochastic. The VM is just trying to win in the average. Which functions get compiled and with what type information can vary from run to run.

Probably the best thing that could happen is that developer tools tell you more about what the JIT is doing. But even that’s hard.

There are some specifics that I disagree with:

- I don’t think all JIT architects for JS claim that the perf is about competing with C for numerical code. I don’t explain it that way. I would say: JITs are about doing the best job you can do under the circumstances. They can make JS run 4x faster than an interpreter if things really go well. “Between Python and Java” is a good way to put it and that’s exactly what I would expect. So if that’s your experience then great! The JIT worked as expected.

- It’s usually foolish to want your code compiled sooner. Compilation delay is about establishing confidence in profiling. I’m pretty sure we’d JIT much sooner if it wasn’t for the fact that it would make the EV of our speculation go negative.

TL;DR. the JIT can’t unfuck up JavaScript.

[+] csande17|6 years ago|reply
The post is likely V8-centric because it's discussing server-side JavaScript. As far as I know, pretty much everyone uses V8 for server-side JS because that's what Node is based on. And compatibility seems to be much less of an issue there as well, at least if all the "dropped support for Node version X" changelogs I read are any indication.

Not to say server-side JavaScript is necessarily a good idea. But if people are determined to make it work, I could see these sorts of changes happening that wouldn't make sense in a web browser.

[+] bjourne|6 years ago|reply
> Probably the best thing that could happen is that developer tools tell you more about what the JIT is doing. But even that’s hard.

Why? You have the exact same problem when writing SQL, but there you have lots of powerful introspection tools that make it easier to control performance. You can also use indices and hints to nudge the RDBMS into executing queries in the optimal way. PyPy, for example, has a lot of support for introspection.

[+] peterkelly|6 years ago|reply
If you were to get the information you were after, it would be specific to a particular implementation, and to a particular version of that implementation. The nature of how JIT compilation is done for JavaScript is not defined in the spec and varies a lot (starting from totally nonexistent many years ago).

To have useful performance-measurement tools that help your code run well across multiple implementations would require (a) all of those implementations to work the same way, forever, and (b) the semantics of this to be specified in the standard.

[+] claytongulick|6 years ago|reply
The thing is, JavaScript performance is mostly "good enough", and honestly that's what really matters.

When you have workloads that are mostly io bound, having syntactic sugar like js async/await to avoid blocking is really a huge strength.
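A toy sketch of that point (the helper names and delays are invented): with async/await, independent I/O waits overlap instead of blocking each other.

```javascript
// Simulated I/O: resolves with `value` after `ms` milliseconds.
const fakeIo = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function loadDashboard() {
  // Both simulated requests are in flight at once; the total wait is
  // roughly max(30, 30) ms rather than 30 + 30 ms, and the event loop
  // stays free to serve other work in the meantime.
  const [users, orders] = await Promise.all([
    fakeIo(30, ['alice', 'bob']),
    fakeIo(30, [{ id: 1 }]),
  ]);
  return { users, orders };
}

loadDashboard().then((d) => console.log(d.users.length, d.orders.length));
```

For an io-bound server, this is the strength being described: concurrency with roughly synchronous-looking code.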

When we write systems, we operate under constraints, and frequently the largest constraint is time to market rather than pure performance.

Dynamic typing can be a huge strength for TTM.

If performance was the only concern, we'd all be in straight C with inline asm.

[+] smabie|6 years ago|reply
I'm not convinced that dynamic typing helps productivity. In my experience, an expressive type system allows one to deploy code much more quickly. The key word, of course, is expressive. Neither Go, nor Java, btw, qualify as having expressive type systems.
[+] gentleman11|6 years ago|reply
As a nodejs developer: what is the best compiled server language to learn right now? Is it Java still, or is it better to look at go or rust?
[+] _delirium|6 years ago|reply
Maybe I'm missing the intended distinction, but in normal usage, Java is also a JITted language. It's almost the canonical one... JITs are much older, but HotSpot really popularized the concept in mainstream production.
[+] Skunkleton|6 years ago|reply
If you are looking for a mature, high-performance language to keep in your back pocket for the cases where you can't use Node, Java seems like a fair bet. Go would also be useful. Neither is too much fun IMO, but either would fit the bill.

There are newer languages that are more interesting, but picking one of those is more about either attempting to anticipate where the industry is going, or just plain wanting a more fun language. Rust is an example here. Personally I think rust is worth learning, and has a fair shot at taking over a large chunk of software development that would have previously gone to C or C++. Learning rust might make you a better programmer because lots of poor-programming-hygiene type things are compiler errors in rust.

If you just want a way to wring maximum performance out of a piece of hardware, then I would caution you to consider your choice carefully. Writing high performance code requires a complete understanding of the stack underneath you. If you want to see my point in practice go check out the programming languages shootout. Specifically notice that there are benchmarks implemented in "fast" languages that are outperformed by better optimized code written in "slow" languages.

I would recommend against C++ unless you are planning on devoting a very large amount of time to learning it well. I also don't personally like C++. Too many footguns.

[+] jitl|6 years ago|reply
It depends on a lot of factors. I'm a TypeScript dev and I'm really enjoying working in Kotlin. I certainly enjoy how expressive it is compared to Go, and I picked it up in a couple of days in an existing codebase, so it's much easier than Rust. I think you can't overstate how good the JetBrains IDE makes the programming experience with code completion, auto-imports, and corrections, or how great it is that Kotlin interoperates with Java seamlessly.
[+] csande17|6 years ago|reply
Part of me wonders if ASP.NET will make a comeback. A lot of the historical reasons not to use it don't exist anymore (there's an officially-supported Linux port, for example), and a lot of the cool new JavaScript language features like async/await and arrow functions came from C#.

Besides, most JavaScript programmers are already dependent on Microsoft products like TypeScript, NPM, and Visual Studio Code. What's one more?

[+] didibus|6 years ago|reply
Java is still the largest and most-used server-side language, at 39%. It is followed by C# at 32%.

After that, you have Go at 9%, Kotlin (also a JVM language) at 6.6%, and Scala (also a JVM language) at 4%.

Finally, you get Elixir and Clojure both at 1.5%

Source: https://insights.stackoverflow.com/survey/2019

So yes, I think learning Java and the JVM ecosystem is a good investment: Java dominates, and if you tally up the JVM languages, something like 51% of professional developers use a JVM-based language for work.

[+] wvenable|6 years ago|reply
Best is highly subjective and entirely dependent on what you want to build.

I find a lot of joy in working in C# and .NET Core. It's very pleasant to work in and might be worth a look.

[+] DethNinja|6 years ago|reply
In my experience Go is amazing for webserver development. Very easy to learn and you can build things very quickly.

However, I still use C++ for extremely complex networked applications where performance matters, unfortunately Rust doesn’t have battle tested crypto libraries, so I wouldn’t personally use it for networked applications.

[+] on_and_off|6 years ago|reply
kotlin is pretty neat to work with.

In the end it is all bytecode (both Java and Kotlin compile to JVM bytecode), but Kotlin removes a lot of the ceremony, boilerplate, and ideas that have proven not to be that great in the years since Java's inception.

[+] gaogao|6 years ago|reply
To learn for what goal? For education, for enterprise, for speed?

To which my shortlist is Haskell, C#, and Rust.

[+] tehlike|6 years ago|reply
C# is pretty decent. Has been since the get go.
[+] fulafel|6 years ago|reply
JS is also frequently a compiled language these days. What makes you qualify the question this way?
[+] smabie|6 years ago|reply
I would say Scala, or maybe F#.
[+] gridlockd|6 years ago|reply
I completely agree. It's possible to do some tracing of what the V8 JIT does, but the workflow is awful.

Microbenchmarks don't represent the real world. Instead of running a single microbenchmark, I suggest running a host of them, all "at the same time" (not serially). They'll get in each other's way, and total performance will be far worse.

This happens with AOT-compiled code as well, of course, because caches get trampled no matter what, but JIT code exacerbates the issue because it is larger.

[+] arkanciscan|6 years ago|reply
Not a big deal if you're mainly writing client-side JS. You're never gonna know how fast something will run in a browser anyway.
[+] anthk|6 years ago|reply
I'd love a JIT for DOSBox-X or QEMU. I know, KVM, but think about emulating non-native archs, or systems without KVM support.
[+] gok|6 years ago|reply
Dead on. JITs are high-interest credit card technical debt when it comes to performance. It says a lot that all the performance-sensitive parts of widely deployed JITs are themselves implemented with AOT compilation.
[+] saagarjha|6 years ago|reply
> It says a lot that all the performance-sensitive parts of widely deployed JITs are themselves implemented with AOT compilation.

…not really? Managed languages with a substantial runtime tend to make for poor ergonomics when writing JITs.