
cb321 | 2 months ago

Haven't checked graalvm in a long time. So, I got graalvm-jdk-25.0.1+8.1 for x86_64. It's a lot faster than Julia, and maybe 43ms is not slow in "human terms", but it's still pretty slow compared to some other competition. This was for a helloworld.jar [1]. On my laptop (i7-1370P, p-cores) using tim[2]:

    $ tim "awk '{}'</n" 'tcc -run /tmp/true.c' 'perl</n' 'py2</n' 'py3</n' 'java -jar helloworld.jar>/n'
    97.5 +- 1.5 μs  (AlreadySubtracted)static dash Overhead
    94.9 +- 3.2 μs  awk '{}'</n
    376.7 +- 4.6 μs tcc -run /tmp/true.c
    525.3 +- 2.7 μs perl</n
    3627.7 +- 6.0 μs        py2</n
    6803 +- 11 μs   py3</n
    42809 +- 71 μs  java -jar helloworld.jar>/n
Also, probably there is some way to tune this, but it used over 128 MiB of RSS.

[1]: https://github.com/jarirajari/helloworld [2]: https://github.com/c-blake/bu/blob/main/doc/tim.md
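(The measurement loop a tool like tim performs can be approximated with a few lines of Python. This is only a simplified stand-in for illustration, not tim itself: no shell-overhead subtraction, no outlier handling, and the command and repetition count are arbitrary.)

```python
import statistics
import subprocess
import time

def time_spawns(argv, reps=50):
    """Time fork+exec+exit of a command, returning (mean, stdev) in microseconds.

    Much simplified relative to tim: wall-clock only, no overhead
    subtraction, no robustness to scheduler noise.
    """
    samples = []
    for _ in range(reps):
        t0 = time.perf_counter()
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - t0) * 1e6)
    return statistics.mean(samples), statistics.stdev(samples)

if __name__ == "__main__":
    mean_us, sd_us = time_spawns(["/bin/true"])
    print(f"{mean_us:.1f} +- {sd_us:.1f} us  /bin/true")
```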


NovaX|2 months ago

That is just a normal JVM with optional Graal components that are not used unless enabled. The default heap size is based on a percentage of available memory, and that memory stays uncommitted (meaning it's still available to other programs). When people mention Graal they usually mean an AOT-compiled executable that runs without an installed JVM; sometimes they mean the Graal JIT as a replacement for C1/C2, which is also available in JVM mode. You are running a plain HotSpot VM in server mode, since the startup-optimized client mode was removed when desktop use cases were deprioritized (e.g. Java Web Start was discontinued).

cb321|1 month ago

You are correct and I apologize for the misimpression.

`native-image -jar helloworld.jar helloworld` took a whopping 17 seconds to compile what might be the smallest possible project. That does make me worry about iterating on performance in a context where startup overhead matters, BUT the executable it produced did run much faster - only about 1.8x slower than `tcc -run`:

    97.0 +- 1.7 μs  (AlreadySubtracted)Overhead
    98.2 +- 2.8 μs  awk '{}'</n
    376.4 +- 4.4 μs tcc -run /tmp/true.c
    527.9 +- 4.0 μs perl</n
    686.5 +- 3.9 μs ./helloworld>/n
Perl has 2 more shared libraries for ld.so to link, but is somehow faster. So, there may still be some room for improvement, but anyway, thank you for the correction.

(Also, I included 4 of the faster comparative programs to show that the error bars are vaguely credible. In truth, on time-shared OSes the timing distributions have heavier tails than a Gaussian, so a single +- is inadequate.)
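(Since such samples are heavy-tailed, order statistics like the median and high percentiles are often a more robust summary than mean +- stdev. A minimal sketch, with made-up sample numbers:)

```python
import statistics

def summarize(samples_us):
    """Summarize timing samples with order statistics, which are far less
    sensitive to a heavy right tail than mean +- stdev."""
    qs = statistics.quantiles(samples_us, n=100)  # 99 percentile cut points
    return {
        "min": min(samples_us),
        "p50": statistics.median(samples_us),
        "p90": qs[89],
        "p99": qs[98],
    }

# Made-up samples: a tight cluster plus a few scheduler-induced stragglers.
samples = [97.0] * 90 + [99.0] * 5 + [250.0, 400.0, 800.0, 1500.0, 5000.0]
s = summarize(samples)
print(f"p50={s['p50']:.1f} us, p90={s['p90']:.1f} us, p99={s['p99']:.1f} us")
```

(With data like this the mean is dragged well above the median by the tail, which is exactly why a single +- understates what is going on.)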

--

EDIT: So, the ld.so/dynamic-linking overhead was bothering me. It took getting a musl + zlib build environment going, but after a few minutes I had one, and a fully statically linked executable gave this result:

    398.2 +- 4.2 μs ./helloworld-sta>/n
(I should have noted earlier that /n -> /dev/null is just a convenience symlink I put on all my systems. Also, this is all on Linux 6.18.2 and the same CPU as before.)

Also, it used only around 4.2 MiB of RSS, compared to ~1.8 MiB for dash & awk. So, about 2.4x the space and only ~4x the time of static awk & dash. That might sound like criticism, but those are usually the efficiency champs. The binary size is also kind of hefty (~2x, like the RAM use):

    $ size *sta
       text    data     bss     dec     hex filename
    2856644 3486552    3184 6346380  60d68c helloworld-sta
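(For anyone wanting to reproduce the RSS side of these numbers from inside a process, one option is getrusage. A minimal sketch; note that ru_maxrss is reported in KiB on Linux but in bytes on macOS, so this helper assumes Linux:)

```python
import resource

def peak_rss_mib():
    """Peak resident set size of the current process, in MiB.

    Assumes Linux, where ru_maxrss is in KiB (on macOS it is in bytes).
    """
    kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return kib / 1024

# Touch ~8 MiB so the number visibly moves above the interpreter baseline.
ballast = bytearray(8 * 1024 * 1024)
print(f"peak RSS ~= {peak_rss_mib():.1f} MiB")
```

(Measuring a child process from outside, as tim-style harnesses do, would instead use `wait4()` or read `/proc/<pid>/status`.)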
So, I guess, in 2026, Java start-up overhead is pretty acceptable. I hope that these "real numbers" can maybe add some precision to the discussion. Someone saying "mere milliseconds" just does not mean as much to me as 400 +- 4 microseconds, and perhaps there are others like me.