DarkShikari|13 years ago
If you want to test without a physical Haswell, the Intel Software Development Emulator should work okay, albeit somewhat slowly. I'd post overall numbers for real Haswells, but Intel has apparently said we can't do that yet.
Regarding FMA, FMA3/4 are floating point only. Since x264 has just one floating point assembly function, only two FMA3/FMA4 instructions get used in all of x264 (not counting duplicates from different-architecture versions of the function). An FMA4 version has been included for a while; the new AVX2 version does include FMA3, but of course that won't run on AMD CPUs (yet).
XOP had some integer FMA instructions, but I generally didn't find them that useful (there are a few places I found they could be slotted in, though).
jamesaguilar|13 years ago
Note: I'm not trying to question your engineering chops, just trying to correct my own misconceptions.
DarkShikari|13 years ago
pjmlp|13 years ago
Those are not C code, but rather inline assembly or compiler intrinsics, none of which has anything to do with C.