It begs a question though: How many instructions in <insert ISA here> are equivalent? I assume that a compiler writer has a list of equivalents and will typically choose the shorter one?
Well, that 0 you are loading comes from the instruction itself, so it is already "there". It boils down to the fact that the XOR instruction is shorter.
In fact, in theory the XOR is the slower one, because XOR has data dependencies on its arguments, so an out-of-order processor could be delayed waiting for the register's previous value. However, x86 processors have special logic that recognizes XOR of a register with itself as carrying no dependency on its arguments.
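To illustrate why that special case is safe, here is a minimal sketch (plain Python standing in for a 32-bit register; `xor_self` is a made-up helper name, not anything from a real toolchain). The result of XORing a value with itself is 0 whatever the register held before, so the hardware does not need the old value to know the answer.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register

def xor_self(reg_value: int) -> int:
    """Hypothetical helper: the result of `xor reg, reg` for a given prior value."""
    return (reg_value ^ reg_value) & MASK32

# The prior contents never matter: every input yields zero.
samples = [0, 1, 0xDEADBEEF, MASK32]
results = [xor_self(v) for v in samples]
print(results)
```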
In addition to other concerns, processors usually treat xor specially, since it was the best way to zero things for so long that it became ubiquitous. Often its performance impact is equivalent to a no-op.
gruturo|3 years ago
That 0 has to come from somewhere, while in the other case XORing a register with itself does not involve loading any data. It's also shorter.
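For a concrete sense of "shorter", here is a small sketch comparing the standard encodings of the two instructions for EAX: `31 C0` for `xor eax, eax` and `B8` followed by a 32-bit immediate for `mov eax, 0`. The table and the `encoded_size` helper are just illustration, and other registers or operand sizes encode differently.

```python
# Machine-code bytes for two ways of zeroing EAX on x86 (32/64-bit mode).
ENCODINGS = {
    "xor eax, eax": bytes([0x31, 0xC0]),                    # opcode 31 /r, ModRM C0
    "mov eax, 0":   bytes([0xB8, 0x00, 0x00, 0x00, 0x00]),  # opcode B8 + imm32 (the 0 lives here)
}

def encoded_size(mnemonic: str) -> int:
    """Hypothetical helper: number of bytes in the instruction's encoding."""
    return len(ENCODINGS[mnemonic])

for mnemonic, code in ENCODINGS.items():
    hex_bytes = " ".join(f"{b:02x}" for b in code)
    print(f"{mnemonic:<14} {hex_bytes}  ({len(code)} bytes)")
```

The four zero bytes in the `mov` encoding are the "0 that has to come from somewhere": they are fetched and decoded as part of the instruction stream.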
anitil|3 years ago
kevincox|3 years ago
hansvm|3 years ago