R1 and O1 point towards a world where training models will be a small bucket of the overall compute. That doesn't mean the total amount of AI compute will stop accelerating, just that interconnected mega-clusters are not the only or most efficient way to run a majority of future workloads. That should be negative news for the company that is currently the only one capable of making chips for these clusters, and positive news for players that can run inference on a single chip, as they will be able to grab a larger share of the compute pie.