Teongot | 1 year ago

It's not so much that JITs became feasible; it's that bigger CPUs were less suitable for Jazelle's approach because of the behaviour of the in-order CPU pipeline.

Because Jazelle converted Java bytecodes into ARM instructions in sequence, there was no opportunity for any instruction scheduling. So a bytecode sequence like:

  // public static int get_x(int x, T a, T b) { return a.x+b.x; }
  aload_1
  getfield #N
  aload_2
  getfield #N
  iadd

would go down the pipeline as something like:

    LDR r1, [r0, #4]   // aload_1
  * LDR r1, [r1]       // getfield
    LDR r2, [r0, #8]   // aload_2
  * LDR r2, [r2]       // getfield
  * ADD r1, r1, r2     // iadd

There would be a pipeline stall before each instruction marked with a *.

On the first ARM9 CPUs with Jazelle, the pipeline is fairly similar to the standard 5-stage RISC pipeline (Fetch-Decode-Execute-MemoryAccess-Writeback), so this stall would be 1 cycle. That wasn't too bad - you could just accept that loads usually took 2 cycles, and it would still be pretty fast.

However, on later CPUs with a longer pipeline the load-use delay was increased. By ARM11, it was 2 cycles - so now the CPU is spending more time waiting on pipeline stalls than it spends actually executing instructions.
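To put rough numbers on that, here is a toy cycle counter for the five-instruction sequence above. It is only a sketch under simplifying assumptions: a single-issue in-order pipeline, ALU results forwarded with no penalty, and every load hitting the cache. The `Insn` tuple and `count_stalls` function are made up for illustration and do not model any real ARM core in detail.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    text: str      # assembly text, for display only
    dest: str      # register written
    srcs: tuple    # registers read
    is_load: bool  # loads deliver their result late


def count_stalls(program, load_use_delay):
    """Stall cycles for an in-order, single-issue pipeline (toy model)."""
    ready_at = {}  # register -> first cycle a consumer may issue
    cycle = 0      # earliest cycle the next instruction could issue
    stalls = 0
    for insn in program:
        issue = max([cycle] + [ready_at.get(r, 0) for r in insn.srcs])
        stalls += issue - cycle
        # Assumption: ALU results forward with no penalty; loads add the delay.
        ready_at[insn.dest] = issue + 1 + (load_use_delay if insn.is_load else 0)
        cycle = issue + 1
    return stalls


# The sequence from the listing above, in bytecode order.
JAZELLE_ORDER = [
    Insn("LDR r1, [r0, #4]", "r1", ("r0",), True),       # aload_1
    Insn("LDR r1, [r1]", "r1", ("r1",), True),           # getfield
    Insn("LDR r2, [r0, #8]", "r2", ("r0",), True),       # aload_2
    Insn("LDR r2, [r2]", "r2", ("r2",), True),           # getfield
    Insn("ADD r1, r1, r2", "r1", ("r1", "r2"), False),   # iadd
]

for name, delay in (("1-cycle load-use", 1), ("2-cycle load-use", 2)):
    print(name, "->", count_stalls(JAZELLE_ORDER, delay),
          "stall cycles for", len(JAZELLE_ORDER), "instructions")
```

In this model the sequence picks up 3 stall cycles with a 1-cycle load-use delay and 6 with a 2-cycle delay, against only 5 instructions issued.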

In contrast, even a basic JIT can implement instruction scheduling and find some independent instructions to do between a load and the use of the result, which makes the JIT much more performant than Jazelle could be.
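As a rough illustration of what "basic" scheduling buys you, here is a minimal greedy list scheduler applied to the same five instructions. It is a sketch, not how any particular JIT works: the `Insn` tuple, `depends_on` and `schedule` are hypothetical names, and the timing model assumes a single-issue in-order pipeline with free ALU forwarding and cache-hit loads.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    text: str      # assembly text, for display only
    dest: str      # register written
    srcs: tuple    # registers read
    is_load: bool  # loads deliver their result late


def depends_on(later, earlier):
    """True if `later` must stay after `earlier`."""
    return (earlier.dest in later.srcs      # read-after-write
            or earlier.dest == later.dest   # write-after-write
            or later.dest in earlier.srcs)  # write-after-read


def schedule(program, load_use_delay):
    """Greedy list scheduling; returns (new_order, stall_cycles)."""
    remaining = list(range(len(program)))
    ready_at = {}  # register -> first cycle a consumer may issue
    order, cycle, stalls = [], 0, 0

    def issue_time(i):
        return max([cycle] + [ready_at.get(r, 0) for r in program[i].srcs])

    while remaining:
        # Candidates have no unissued earlier instruction they depend on.
        candidates = [
            i for i in remaining
            if not any(j in remaining and depends_on(program[i], program[j])
                       for j in range(i))
        ]
        # Pick whichever can issue soonest; ties keep the original order.
        pick = min(candidates, key=lambda i: (issue_time(i), i))
        issue = issue_time(pick)
        stalls += issue - cycle
        insn = program[pick]
        ready_at[insn.dest] = issue + 1 + (load_use_delay if insn.is_load else 0)
        order.append(insn.text)
        remaining.remove(pick)
        cycle = issue + 1
    return order, stalls


PROGRAM = [
    Insn("LDR r1, [r0, #4]", "r1", ("r0",), True),       # aload_1
    Insn("LDR r1, [r1]", "r1", ("r1",), True),           # getfield
    Insn("LDR r2, [r0, #8]", "r2", ("r0",), True),       # aload_2
    Insn("LDR r2, [r2]", "r2", ("r2",), True),           # getfield
    Insn("ADD r1, r1, r2", "r1", ("r1", "r2"), False),   # iadd
]

for delay in (1, 2):
    new_order, stall_cycles = schedule(PROGRAM, delay)
    print("load-use delay", delay, "->", stall_cycles, "stall cycles")
    for text in new_order:
        print("   ", text)
```

Under these assumptions the scheduler hoists the second `aload` above the first `getfield`, which cuts the stall cycles from 3 to 1 with a 1-cycle load-use delay and from 6 to 3 with a 2-cycle delay. A real JIT sees a whole method rather than five instructions, so it usually has more independent work to slot into the gaps.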

colejohnson66|1 year ago

Not just JITs; an out-of-order scheduler would have no issue reordering those instructions. However, Jazelle was designed for the low-end ARM processors, which are all in-order (or were at the time; not sure how true that is today).