top | item 36533945

(no title)

fsfod | 2 years ago

At least AMD kept AVX-512 in their small Zen 4c cores and just sacrificed some cache instead. I have to wonder if it was Intel marketing that killed off working AVX-512 in consumer P-cores after they were released, because the E-cores just become dead weight with AVX-512 enabled.


paulmd|2 years ago

Nobody really knows what is going on with AVX-512 inside Intel.

Linus has commented that it's trivial to trap the first time an AVX-512 instruction is used and pin that thread to a P-core (the kernel used to trap like this as an optimization, to avoid saving/restoring AVX registers in non-AVX code), and he doesn't know why that patch isn't on his desk.

The other problem is presumably that the CPUID result depends on which core it's executed on, but it seems straightforward for code to just run CPUID on every single core (using affinity) and analyze the results. OK, 16 AVX-512 threads and 8 AVX2 threads, that's fine! It is obviously not the way code is currently written, but code isn't written for AVX-512 right now anyway, and it should be literally an hour of work for a C dev.
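For what it's worth, the "CPUID on every core" idea could look roughly like the sketch below. It assumes Linux (`sched_setaffinity`), an x86 build with GCC or Clang (`<cpuid.h>`), and that online CPUs are numbered contiguously from 0; the function names are made up for illustration. It only checks the AVX-512F bit (CPUID leaf 7, subleaf 0, EBX bit 16) and does not check whether the OS has enabled the AVX-512 register state, which real code would also need to do.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <unistd.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* AVX-512F is reported in CPUID leaf 7, subleaf 0, EBX bit 16. */
static bool ebx_has_avx512f(unsigned int ebx) {
    return ((ebx >> 16) & 1u) != 0;
}

/* Hypothetical helper: pin the calling thread to `cpu`, then run CPUID there.
 * Returns 1 if that core reports AVX-512F, 0 if it does not,
 * and -1 if we could not pin to it or this is not an x86 build. */
static int cpu_reports_avx512f(int cpu) {
#if defined(__x86_64__) || defined(__i386__)
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;              /* e.g. CPU offline or outside our cpuset */

    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 7 not supported on this core */
    return ebx_has_avx512f(ebx) ? 1 : 0;
#else
    (void)cpu;
    return -1;
#endif
}

/* Hypothetical helper: survey every online CPU and count how many report
 * AVX-512F and how many do not. Cores we cannot pin to are skipped.
 * Assumes CPU ids run 0..n-1, which is not guaranteed on every system. */
static void survey_cores(int *with_avx512, int *without_avx512) {
    assert(with_avx512 != NULL && without_avx512 != NULL);

    /* Remember the original affinity mask so we can restore it afterwards. */
    cpu_set_t original;
    bool have_original = sched_getaffinity(0, sizeof(original), &original) == 0;

    *with_avx512 = 0;
    *without_avx512 = 0;

    long n = sysconf(_SC_NPROCESSORS_ONLN);
    for (long cpu = 0; cpu < n; cpu++) {
        int r = cpu_reports_avx512f((int)cpu);
        if (r == 1)
            (*with_avx512)++;
        else if (r == 0)
            (*without_avx512)++;
    }

    if (have_original)
        sched_setaffinity(0, sizeof(original), &original);
}
```

Calling `survey_cores(&a, &b)` once at startup would give a program the two counts, which it could then use to size an AVX-512 worker pool and an AVX2 worker pool. The harder part, keeping each worker on the right kind of core afterwards, is not covered here.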

I guess maybe they just didn't want the long tail of support, but with their future architectures being heterogeneous too, they don't seem to have any plan either, and they're still shipping the AVX-512 units in silicon, meaning they are paying millions of dollars for a feature that isn't enabled. Very, very weird.

Fun fact: Alder Lake can actually be run with AVX-512 enabled even with E-cores active. There is an undocumented MSR flag that seems to allow this. It has to be a stepping/BIOS revision from before it was locked out, though.

colejohnson66|2 years ago

> he doesn't know why that patch isn't on his desk

Because then any time a program links to a vector-supporting library such as glibc, there's a high chance it'll be pinned to a P-core when an E-core and AVX2 would've sufficed.