daemontus | 3 years ago
Threadripper Pro exists and, as far as I can tell, certain industries are practically begging AMD to keep updating it. A system with a 32-core TR Pro and a few hundred gigs of RAM is easily $10k; with a 64-core CPU, you are probably closer to $15k. If a Mac Pro can give you 40-60 cores, 128-512GB of RAM, plus a powerful GPU for $10-20k, it's certainly niche, but not dead on arrival.
I think the main issue is that while AMD can "justify" TR Pro by building it from disaggregated compute and IO chiplets, Apple cannot. AMD can just "print" as many CPU core chiplets as it likes (within the power budget), and then slap on a different IO die depending on what product it's making.
Currently, Apple isn't doing that. Furthermore, they backed themselves into an even tighter corner by gluing the CPU and GPU onto one die. Most professional tasks are fine with a monster CPU and little or no GPU, or a monster GPU and an OK CPU. Adding PCIe slots with support for discrete GPUs would help with that, and for memory-heavy tasks you could add CXL memory expansion. But that raises a new problem: which dies would Apple fab for this product?
With M1, there were essentially three dies: the power-efficient M1, the M1 Max/Ultra with the interposer, and the M1 Pro (there were rumours that some Pros are just failed Maxes with part of the chip cut off, but afaik nobody was able to verify this, so three dies it is). For M2, we have only seen the power-efficient die so far. If the M2 Extreme were 4x M2 Max, the previous strategy could work. But if Apple wants PCIe or CXL memory modules on the M2 Extreme, it has to put them on all of its Max (and maybe Pro) dies as well, even though those will go into laptops and all-in-ones that won't use any of it. In other words, a ton of analog circuitry that's a complete waste of silicon. The only other option is to fab special dies for the M2 Extreme, which might have been the plan, and what probably made the ROI so bad.
It's kind of like Sapphire Rapids: SR is certainly fast for what it's doing, but it's super expensive compared to EPYC and less scalable precisely because you need different dies to implement different SKUs (SR is monolithic up to, afaik, ~20 cores, and the "chiplets" only appear above that; the IO you get depends on the chiplets). And SR doesn't even include a GPU...
Finally, Apple's GPUs don't appear to be scaling as well as Apple hoped. I have yet to see a real-world M1 Ultra review that didn't end on a "+20-50% more performance for +100% of the price" note. The CPU cluster seems to be doing much better. But by tying four of them together for an M2 Extreme, people who need CPU would have to pay for the whole thing, and people who need GPU would probably be left disappointed.
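To make that scaling complaint concrete, here is a quick back-of-the-envelope perf-per-dollar check. It is only a sketch using the rough "+20-50% performance for +100% price" figures quoted above, not measured benchmarks:

```python
# Perf-per-dollar of a bigger chip relative to the smaller one it is built from.
# Inputs are fractional gains: 0.20 means "+20% performance", 1.00 means "+100% price".
def relative_value(perf_gain: float, price_gain: float) -> float:
    return (1 + perf_gain) / (1 + price_gain)

# The review-summary figures for M1 Ultra vs. M1 Max (rough numbers, not benchmarks):
low = relative_value(0.20, 1.00)   # pessimistic case: 0.60, i.e. ~40% worse perf/$
high = relative_value(0.50, 1.00)  # optimistic case: 0.75, i.e. ~25% worse perf/$
print(f"Ultra perf/$ relative to Max: {low:.2f}x to {high:.2f}x")
```

So even in the optimistic case, doubling the die buys noticeably less than double the performance per dollar, which is the crux of the "pay for the whole thing" problem.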
My personal best bet would be to keep releasing a power-efficient "consumer" die, but disaggregate everything at the "pro" level: a single CPU-heavy "Pro" die with a bit of GPU and a bit of IO; a GPU-heavy die that slaps onto it to make the Max; and an IO-only die for the Mac Pro, mixed and matched with a combination of CPU and GPU dies. I wouldn't be shocked if this were the plan for M3/M4; the question is whether the packaging technology will be there in time (and economical; wink wink, Intel).