(no title)
innocenat | 1 year ago
The problem is that superscalar CPU needs to decode multiple x86 instructions per cycle. I think latest Intel big core pipeline can do (IIRC) 6 instructions per cycle, so to keep the pipeline full the decode MUST be able to decode 6 per cycle too.
If it's ARM, it's easy to do multiple decode. M1 do (IIRC) 8 per cycle easily, because the instruction length is fixed. So the first decoder starts at PC, the second starts at PC+4, etc. But x86 instructions are variable length, so after the first decoder decodes instruction at IP, where does the second decoder start decoding at?
kijiki|1 year ago
jart|1 year ago