drivebycomment | 1 year ago
> I once extended a Common Lisp compiler to emit machine code for SSE4.2 instructions (specifically minss and maxss). The experience was a bit bad due to subtle differences in prefixes and specific fields needing to be set to activate some mode for SSE4.2 instructions.
I assume this was a toy compiler or a non-optimizing compiler. LLVM, GCC, or any other industrial-strength optimizing compiler has no trouble whatsoever dealing with any of those. The difficulty with more complex instructions such as vector instructions lies in optimization: finding the code patterns that can take advantage of them. That has nothing whatsoever to do with variable-length encoding, prefixes, or knowledge of the instruction set itself. If the program is already written for it, e.g. using intrinsics, emitting and mapping to the machine code is trivial, regardless of how complex the instruction encoding rules are.
koito17 | 1 year ago
I don't know the definition of "toy compiler", but compare the following (x86 backend vs. arm64 backend):
https://github.com/Clozure/ccl/blob/d960a0e/compiler/X86/x86...
https://github.com/Clozure/ccl/blob/d960a0e/compiler/ARM64/a...
I would argue the former is a lot more complex than its functional equivalent for arm64.
The specific extension to allow e.g. minss would look something like this
Now try doing the same for e.g. movsxd, and you will have to be careful with the ModR/M byte, due to the VEX prefix changing semantics.
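To make the prefix and ModR/M bookkeeping concrete, here is a minimal sketch in Python. It is not the CCL code; the table `SCALAR_SSE_OPCODES` and the function `encode_scalar_sse` are hypothetical names made up for illustration. It covers only the register-to-register form of minss/maxss (F3 0F 5D /r and F3 0F 5F /r), with a REX prefix added when xmm8..xmm15 are involved. Memory operands, SIB bytes, and VEX encodings are left out.

```python
# Hypothetical helper, not taken from CCL: encodes the register-to-register
# form of a scalar single-precision SSE instruction.
SCALAR_SSE_OPCODES = {"minss": 0x5D, "maxss": 0x5F}


def encode_scalar_sse(mnemonic: str, dst: int, src: int) -> bytes:
    """Encode `mnemonic xmm<dst>, xmm<src>` for registers xmm0..xmm15."""
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be in 0..15")
    opcode = SCALAR_SSE_OPCODES[mnemonic]

    out = bytearray([0xF3])  # mandatory prefix selecting the scalar-single form

    # REX.R extends the ModR/M reg field, REX.B extends the r/m field.
    rex = 0x40 | ((dst >> 3) << 2) | (src >> 3)
    if rex != 0x40:
        out.append(rex)  # REX goes after F3 and directly before the 0F escape

    out += bytes([0x0F, opcode])

    # mod=11 means both operands are registers; reg=dst, r/m=src (low 3 bits).
    out.append(0xC0 | ((dst & 7) << 3) | (src & 7))
    return bytes(out)


print(encode_scalar_sse("minss", 0, 1).hex(" "))  # f3 0f 5d c1
print(encode_scalar_sse("maxss", 8, 9).hex(" "))  # f3 45 0f 5f c1
```

Even in this small case, the order of the bytes matters: the F3 prefix has to come before REX, and REX has to sit immediately before the opcode bytes. That ordering is the kind of detail the x86 backend has to track and the arm64 one does not.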