Thank you for the link. That article says the cpu has a 630 op reorder buffer. Does that mean it has to throw out up to 630 instructions on a mispredict? That sounds huge, so I wonder if I’m misunderstanding.
At first glance it might seem that the execution order matters, because if you reorder lines in this source code the meaning does change (e.g. the first instruction depends on an input in r2, but r2 gets a new value in the second instruction).
But if you rewrite the names of the registers (variables) and use a larger set of registers (variables) then you can express the same semantics without overwriting any registers. If you do so you can run the operations in parallel on different execution units within the same core.
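To make the renaming idea concrete, here is a rough Python sketch. The tuple format `(dest, op, src1, src2)`, the `rename` function and the `p0`, `p1`, ... physical register names are all made up for illustration. Real hardware has a finite physical register file and recycles registers, which this toy ignores.

```python
from itertools import count

def rename(program):
    """Rename registers so that no register is ever written twice.

    `program` is a list of (dest, op, src1, src2) tuples using architectural
    register names such as "r1". This format is hypothetical.
    """
    fresh = (f"p{i}" for i in count())  # toy assumption: unlimited physical registers
    current = {}                        # architectural name -> latest physical name
    renamed = []
    for dest, op, a, b in program:
        srcs = []
        for s in (a, b):
            # A register read before any write is a program input;
            # give it a physical register the first time we see it.
            if s not in current:
                current[s] = next(fresh)
            srcs.append(current[s])
        # Every write goes to a brand-new physical register,
        # so earlier readers of the old value are unaffected.
        current[dest] = next(fresh)
        renamed.append((current[dest], op, srcs[0], srcs[1]))
    return renamed

program = [
    ("r1", "add", "r2", "r3"),  # reads r2
    ("r2", "mul", "r4", "r5"),  # overwrites r2, so the order seems to matter
]
renamed_program = rename(program)
for instr in renamed_program:
    print(instr)
```

After renaming, the two instructions no longer share any register, so nothing forces the first one to run before the second.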
The ability to run multiple instructions at the same time is called "multiple (or wide) issue".
There are different kinds of functional units inside a cpu. For example, some can do basic arithmetic but not divide. At any given time, some functional units will be busy and others free. The amount of parallelism in the input stream that you can actually use is bounded by whether you can map your instructions onto functional units that are free at that time.
Often, while you cannot run a given instruction because, for example, the multiplier unit is busy, the next instruction could run because it depends on the divider functional unit, which is free at that time.
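The busy-multiplier, free-divider situation can be sketched as a toy scheduler in Python. Everything here is an assumption made for illustration: the `schedule` function, the `(name, unit, deps)` instruction format, the latencies, and the simplification that there is exactly one unit of each kind and that a unit is blocked until its current instruction finishes. Real cores are far more elaborate.

```python
def schedule(program, latency):
    """Return the cycle in which each instruction starts executing.

    `program` is a list of (name, unit, deps) tuples, where `deps` lists the
    names of earlier instructions whose results are needed (hypothetical format).
    `latency` maps each unit kind to how many cycles it stays busy.
    """
    unit_free_at = {unit: 0 for unit in latency}  # cycle at which each unit is free again
    done_at = {}                                  # instruction name -> cycle its result is ready
    start = {}
    waiting = list(program)
    cycle = 0
    while waiting:
        # Scan in program order, but start any instruction that is ready,
        # even if an older one is still stuck.
        for instr in list(waiting):
            name, unit, deps = instr
            inputs_ready = all(d in done_at and done_at[d] <= cycle for d in deps)
            if inputs_ready and unit_free_at[unit] <= cycle:
                start[name] = cycle
                unit_free_at[unit] = cycle + latency[unit]
                done_at[name] = cycle + latency[unit]
                waiting.remove(instr)
        cycle += 1
    return start

latency = {"mul": 3, "div": 4, "alu": 1}  # made-up latencies
program = [
    ("m1", "mul", []),
    ("m2", "mul", []),      # multiplier is busy with m1, so this has to wait
    ("d1", "div", []),      # divider is free, so this starts ahead of m2
    ("a1", "alu", ["m1"]),  # needs m1's result first
]
start = schedule(program, latency)
print(start)
```

In this run `m1` and `d1` both start in cycle 0 (two instructions in the same cycle), while `m2` sits behind the busy multiplier and `a1` waits for `m1`'s result, so both start in cycle 3.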
The ability to execute instructions out of order is called, well, "out of order execution". It depends on the ability to rename registers and keep track of actual data dependencies. The cpu needs some memory to keep track of that data.
It's unrelated to pipelining (which I could try explaining too, if anybody cares to hear a similar explanation).
ithkuil|5 years ago
unknown|5 years ago
[deleted]
vharish|5 years ago