pso | 5 years ago
All of the warmup, and the transitions from interpreter, to JIT, to optimised JIT, happen inside the first few microseconds or milliseconds of EVERY one of their thousands of process executions. Their measurements are ALL of the steady-state variation of the VM after warmup has taken place. The VM is optimizing within the first 1-1000 inner loops occurring at the start of EACH process execution. For most working programmers, a variation of a few percent on a running system AFTER warmup, in "steady-state peak performance", and before any I/O takes place (because language benchmarks avoid I/O), would not be an issue. If it is an issue, then the article perhaps demonstrates that a compiled language would offer less variation.
The benchmarks listed range from a shortest of around 0.4s for fannkuch/HotSpot/Linux, up to 1.8s for n-body/PyPy/Linux. This "long-running" benchmark code (of 0.4s to 1.8s) by definition has to execute its inner loops/hot code many times, and that hot code is quickly optimized; otherwise the benchmark code would have to be millions of lines long in order to have a sufficient runtime length. Tests need to run for at least tenths of a second for cross-language comparisons, since JITted languages take some iterations to warm up.
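The per-iteration measurement style being discussed can be sketched in a few lines of Python. This is only an illustration of the harness shape, not of tiered compilation (CPython has no JIT); `inner_loop` and the iteration counts here are made-up stand-ins, not the paper's actual benchmarks:

```python
import time

def inner_loop(n=100_000):
    # Stand-in "benchmark" body: a tight arithmetic loop, the kind of
    # hot code a JIT would optimise within the first iterations.
    total = 0
    for i in range(n):
        total += i * i
    return total

def measure(iterations=30):
    # Time each in-process iteration separately, so that any warmup
    # effect shows up in the first few samples instead of being
    # averaged into one aggregate number.
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        inner_loop()
        samples.append(time.perf_counter() - t0)
    return samples

samples = measure()
# Relative spread across the later, nominally "steady-state" samples:
steady = samples[5:]
spread = (max(steady) - min(steady)) / min(steady)
```

On a JITted VM, the first few samples would absorb the interpreter-to-optimised-JIT transitions; the argument above is that those transitions are over long before the bulk of the samples are taken.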
hedora | 5 years ago
They’re trying to show that “warmed up steady state” isn’t something that reliably exists.
pso | 5 years ago
The final graph shows a binary-trees program in C, with a 6% variation between in-process executions and no steady state; it seems logical that most VMs will show the same or worse variation.
The "warmed-up steady state" does exist, but not if they define it so narrowly. All of their iterations and timings are taken while running at 30x to 100x interpreted speed; the only "cold" interpreted code is in the first few microseconds of the first loops of an execution.
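The claim that cold interpreted code is negligible can be checked with back-of-envelope arithmetic. The specific numbers below (10 million total inner-loop iterations, the first 1,000 interpreted at a 50x slowdown) are illustrative assumptions, chosen to sit inside the 30x-100x range mentioned above:

```python
# Back-of-envelope: what fraction of a run is spent in "cold" interpreted code?
# Assumed, illustrative numbers -- not measurements from the paper.
total_iters = 10_000_000   # inner-loop iterations in one process execution
cold_iters = 1_000         # iterations executed before the JIT kicks in
slowdown = 50              # interpreted cost vs optimised cost (30x-100x range)

hot_cost = 1.0                                   # unit cost of one optimised iteration
cold_time = cold_iters * slowdown * hot_cost     # time spent interpreting
hot_time = (total_iters - cold_iters) * hot_cost # time spent in optimised code
cold_fraction = cold_time / (cold_time + hot_time)
# cold_fraction comes out to roughly 0.005, i.e. about half a percent
```

Under these assumptions, even a 50x interpreter penalty on the first thousand iterations contributes well under one percent of total runtime, which is why the warmup phase is invisible next to a 6% steady-state variation.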