kolbusa | 3 years ago

My guess would be that this is because of thread migration.

(After reading TFA: that's what Agner says right there in the 4th paragraph.)

(After re-reading the comment: I guess that the OS changes would need to be extensive with little to no benefit: running AVX2 on all cores will likely be faster than running 2 P cores with AVX512. The only thing that is really affected is the code that could use AVX512_FP16, but I doubt there's a lot of it outside of Intel.)

kijiki|3 years ago

> I guess that the OS changes would need to be extensive

I don't think that is true. In the simplest case, you could modify the #UD handler to notice when the fault is caused by an AVX512 instruction running on an E-core, and then simply pin the process to the P-cores, migrate the process, and continue. All existing scheduler functionality.
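To make that concrete, here is a rough toy model of the handler logic in Python. It is not kernel code: the core numbering, the `Task` class, and `handle_invalid_opcode` are all made up for illustration, and a real handler would have to decode the faulting instruction itself rather than be handed a boolean.

```python
from dataclasses import dataclass

# Hypothetical topology: cores 0-1 are P-cores (AVX-512 capable), 2-5 are E-cores.
P_CORES = frozenset({0, 1})
E_CORES = frozenset({2, 3, 4, 5})
ALL_CORES = P_CORES | E_CORES


@dataclass
class Task:
    name: str
    current_core: int
    allowed_cores: frozenset = ALL_CORES  # affinity mask, unrestricted by default


def handle_invalid_opcode(task: Task, faulting_insn_is_avx512: bool) -> str:
    """Toy stand-in for a #UD handler; returns the action taken."""
    if faulting_insn_is_avx512 and task.current_core in E_CORES:
        # Restrict affinity so the scheduler never places the task on an E-core again.
        task.allowed_cores = task.allowed_cores & P_CORES
        if not task.allowed_cores:
            # The existing mask already excluded every P-core: nowhere to retry.
            return "sigill"
        # Migrate to a permitted P-core, then restart the faulting instruction.
        task.current_core = min(task.allowed_cores)
        return "retry"
    # Any other #UD is treated as a genuine illegal instruction.
    return "sigill"
```

With a task sitting on core 3, calling `handle_invalid_opcode(task, True)` narrows its mask to the P-cores, moves it to core 0, and returns `"retry"`. The point is that the only new piece is the check in the fault path; the pinning and migration are what affinity masks already do.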

> The only thing that is really affected is the code that could use AVX512_FP16, but I doubt there's a lot of it outside of Intel.

AVX512 is a lot more than just extending the vector width, and that extended functionality can be very useful for quickly emulating other CPUs' vector instruction sets.