timlatim | 1 year ago
I think the TDP on the 9700X and 9600X may have been set a bit too low (in fact, there are indications it will be raised in a future BIOS update [1]), which led to a relatively cool reception from reviewers focused on raw performance. When looking at performance-per-watt in Phoronix tests, the 9700X and 9600X often fare better than the bigger chips with higher TDP, but for desktops I guess efficiency is just not that big of a concern.
[1] https://videocardz.com/newz/amd-set-to-boost-tdp-for-ryzen-5...
kimixa|1 year ago
It'll be interesting to see if it remains niche - I do a fair bit of work on graphics rendering (some games, some not) and there's quite a bit in avx512 that interests me - even ignoring the wider register width. A lot of pretty common algorithms we use can be expressed a fair bit easier and simpler using some of those features.
Previous implementations either weren't available on consumer platforms, or had issues where they would downclock/limit ALU calculation width for some time after an avx512 instruction was run, only returning to full speed after a significant time - presumably when whatever power delivery issues could settle - which seriously limited the use cases in which it made sense. It wasn't worth it to have "small data set" users of avx512, as it would actually run slower than the equivalent avx2 code due to this. And the size of "large enough" data sets was pretty close to where it would be better to schedule a task on the GPU anyway...
But AMD's implementation doesn't seem to have this problem - so this opens up the instruction set to many more use cases than previous implementations.
Or has the AVX512 ship already sailed? With Intel apparently being unable to fix these issues and having started hacking it into even smaller bits? I mean, arguably they should have started with that - the register width is probably the least interesting part to me, but at some point having it actually widely adopted might be more useful than a "possibly better" version that no chip actually supports.
Remnant44|1 year ago
Just as a small example from current code, the much more powerful AVX512 byte-granular two-source-register shuffles (vpermt2b) are very tempting for hashing/lookup table code, turning a current perf bottleneck into something that doesn't even show up in the profiler. And according to (http://www.numberworld.org/blogs/2024_8_7_zen5_avx512_teardo...) Zen5 has not one but _TWO_ of them, at a throughput quadrupling Intel's best effort.
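For anyone who hasn't used it, here is a rough scalar model of what the 512-bit two-source byte shuffle does, written in Python purely to show the semantics (this is not how you would call it in real code, where it is normally reached through an intrinsic such as `_mm512_permutex2var_epi8`). The function name, the example table and the keys are all made up for illustration. The idea is that two 64-byte registers act as one 128-entry byte table: bit 6 of each index picks the register, bits 0-5 pick the byte, and bit 7 is ignored.

```python
def permutex2var_epi8(a: bytes, idx: bytes, b: bytes) -> bytes:
    """Scalar model of a 512-bit two-source byte shuffle (vpermt2b-style).

    a, b : the two 64-byte source "registers" (together a 128-entry table)
    idx  : 64 byte indices, one per output lane
    """
    if not (len(a) == len(b) == len(idx) == 64):
        raise ValueError("all operands must be exactly 64 bytes")
    out = bytearray(64)
    for lane, i in enumerate(idx):
        src = b if i & 0x40 else a   # bit 6 selects the source register
        out[lane] = src[i & 0x3F]    # bits 0-5 select the byte within it
    return bytes(out)


# Hypothetical 128-entry lookup table split across two 64-byte "registers".
table = bytes((7 * i + 3) & 0xFF for i in range(128))
lo, hi = table[:64], table[64:]

# 64 byte-sized keys looked up in one step; bit 7 of each key is ignored.
keys = bytes((5 * i) & 0xFF for i in range(64))
result = permutex2var_epi8(lo, keys, hi)
```

In hardware all 64 lookups happen in a single instruction, which is why a 128-entry byte table that fits in two registers can replace a loop of dependent memory loads.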
kvemkon|1 year ago
From an article:
> Does Zen5 throttle under AVX512?
> Yes it does. Intel couldn't get away from this, and neither can AMD. Laws of physics are the laws of physics.
> The difference is how AMD does the throttling ...
Further details in the article [1].
Discussed here on HN: [2], [3].
[1] https://www.numberworld.org/blogs/2024_8_7_zen5_avx512_teard...
[2] https://news.ycombinator.com/item?id=41182395
[3] https://news.ycombinator.com/item?id=41248260