DarkShikari | 13 years ago
In return, you are stuck with an extremely ugly syntax and a much less functional preprocessor, with the added bonus of a compiler that mangles your code.
Scaevolus|13 years ago
Do any production compilers schedule instructions to maximize superscalar performance?
kevinnk|13 years ago
jedbrown|13 years ago
DarkShikari|13 years ago
The pain of not having a proper macro assembler in C intrinsics is orders of magnitude worse than having to do my own register allocation in yasm, so for now, yasm is the lesser of two evils.
_ihaque|13 years ago
In my (admittedly limited) experience [1], the compiler has actually done pretty decently at register allocation in intrinsic-heavy loops. I wrote out the assembly loop in [2] with manual allocation into all 16 XMM registers, and then noticed that the compiler had managed to eliminate one of them.
[1] https://github.com/simtk/IRMSD
[2] https://github.com/SimTk/IRMSD/blob/master/python/IRMSD/theo...
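To make the register allocation point concrete, here is a minimal sketch of an intrinsic-heavy loop. It is not code from IRMSD; the function name dot_product, the HAVE_SSE macro, and the choice of a dot product with two accumulators are all assumptions made for illustration. The point is only that the source names values (acc0, acc1) rather than registers, so assigning them to XMM registers is left to the compiler. A scalar fallback is included so the sketch still builds where SSE is unavailable.

```c
#include <assert.h>
#include <stddef.h>

/* HAVE_SSE is a hypothetical macro for this sketch, not a standard one. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Illustrative only: dot product of two float arrays.
 * Assumes n is a multiple of 8 so the unrolled loop needs no tail handling. */
float dot_product(const float *a, const float *b, size_t n)
{
    assert(n % 8 == 0);
#ifdef HAVE_SSE
    /* Two independent accumulators; the compiler decides which XMM
     * registers hold them and the temporaries below. */
    __m128 acc0 = _mm_setzero_ps();
    __m128 acc1 = _mm_setzero_ps();
    for (size_t i = 0; i < n; i += 8) {
        __m128 p0 = _mm_mul_ps(_mm_loadu_ps(a + i),     _mm_loadu_ps(b + i));
        __m128 p1 = _mm_mul_ps(_mm_loadu_ps(a + i + 4), _mm_loadu_ps(b + i + 4));
        acc0 = _mm_add_ps(acc0, p0);
        acc1 = _mm_add_ps(acc1, p1);
    }
    /* Horizontal sum of the four lanes. */
    float lanes[4];
    _mm_storeu_ps(lanes, _mm_add_ps(acc0, acc1));
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#else
    /* Scalar fallback for targets without SSE. */
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
#endif
}
```

Compiling something like this with gcc -O2 -S and reading the generated assembly is one way to see how many XMM registers the compiler actually ends up using.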