That was literally my question. Is this basically just for more datacenters, Nvidia chips, and electricity, with a sprinkling of engineers to run it all? If so, then that $500bn should NOT be invested in today's tech, but instead in making more powerful and power-efficient chips, IMO.
Nvidia and TSMC are already working on more powerful and efficient chips, but the physical limits to scaling mean each new generation of chips is going to draw a lot more power. New chips might improve by offering specific features such as FP4, but Moore's law is still dead.
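A rough, illustrative bit of arithmetic on why a feature like FP4 matters even without transistor scaling (the 1-trillion-parameter figure below is a made-up example, not any particular model):

```python
# Memory needed just to hold the weights of a hypothetical 1T-parameter model
# at different precisions. Bandwidth and energy per weight moved scale roughly
# the same way, which is where much of the "better chips" gain now comes from
# rather than raw density scaling.
params = 1e12
for name, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    gib = params * bits / 8 / 2**30
    print(f"{name}: {gib:>8,.0f} GiB of weights")
# FP16: ~1,863 GiB   FP8: ~932 GiB   FP4: ~466 GiB
```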
I'll make a wild guess that they will be building data centers and maybe robotic labs. They are starting with $100B of committed money, mostly from SoftBank, though probably not transacted yet.
> building new AI infrastructure for OpenAI in the United States
The carrot is probably something like: we will build enough compute to make a superintelligence that will solve all the problems, ???, profit.
If we look at the processing requirements in nature, I think the main trend in AI going forward is going to be doing more with less, not doing less with more, which is where the current scaling is headed.
Thermodynamic neural networks may also basically turn everything on its ear, especially if we figure out how to scale them like NAND flash.
If anything, I would estimate that this is a space-race type effort to “win” the AI “wars”. In the short term, it might work. In the long term, it’s probably going to result in a massive glut in accelerated data center capacity.
The trend of technology is towards doing better than natural processes, not doing it 100,000x less efficiently. I don’t think AI will be an exception.
If we look at what is theoretically possible using thermodynamic wells with current model architectures, for instance, we could in principle make a network that applies 1T parameters in something like 1 cm². Back of the napkin, it would use about 20 watts and be able to generate a few thousand tokens per second.
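For what it's worth, here is the back-of-the-napkin arithmetic behind those numbers, with my assumptions spelled out (a dense forward pass at roughly 2 ops per parameter per token; the 20 W and "few thousand tokens per second" figures are the claim being checked, not measurements):

```python
# Implied energy efficiency of the hypothetical 1T-parameter, 20 W, ~2000 token/s device.
params = 1e12
ops_per_token = 2 * params   # ~2 ops per parameter per token for a dense forward pass
tokens_per_s = 2000          # "a few thousand tokens per second"
power_w = 20.0               # claimed power envelope

ops_per_joule = ops_per_token * tokens_per_s / power_w
print(f"{ops_per_joule:.1e} ops per joule")   # ~2e14 ops/J

# Today's GPUs land on the order of 1e12 FLOP per joule for dense FP16 work,
# so the claim amounts to roughly two orders of magnitude better energy efficiency.
```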
Operational thermodynamic wells have already been demonstrated in silicon. There are scaling challenges, cooling requirements, etc., but AFAIK no theoretical roadblocks to scaling.
Obviously, the theoretical doesn’t translate to results, but it does correlate strongly with the trend.
So the real question is, what can we build that can only be done if there are hundreds of millions of NVIDIA GPUs sitting around idle in ten years? Or alternatively, if those systems are depreciated and available on secondary markets?
Reasonably speaking, there is no way they can know how they plan to invest $500 billion. The current generation of large language models basically uses all human text that's ever been created to train its parameters... not really sure where you go after that using the same tech.
That's not really true - the current generation, as in "of the last three months", uses reinforcement learning to synthesize new training data for itself: https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero
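For anyone unfamiliar with the idea, here is a toy sketch of the loop being described: sample candidate solutions, keep the ones a programmatic checker can verify, and train on those. Everything below is a stand-in (random guessing instead of an LLM, arithmetic instead of real problems); it shows the shape of R1-Zero-style self-generated data, not how DeepSeek actually implements it.

```python
import random

def toy_model(problem):
    """Stand-in for an LLM: answers an addition problem, sometimes wrongly."""
    a, b = problem
    return a + b + random.choice([-1, 0, 0, 0, 1])

def verify(problem, answer):
    """Programmatic checker (exact math answers, unit tests, proof checkers, ...)."""
    a, b = problem
    return answer == a + b

problems = [(random.randint(0, 99), random.randint(0, 99)) for _ in range(1000)]
dataset = []
for p in problems:
    candidates = [toy_model(p) for _ in range(8)]               # sample several attempts
    dataset.extend((p, c) for c in candidates if verify(p, c))  # keep only verified ones

# The verified (problem, solution) pairs become new training data; the real systems
# then fine-tune / reward the model on them and repeat with the improved model.
print(f"kept {len(dataset)} verified samples out of {len(problems) * 8} generated")
```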
The latest hype is around "agents"; everyone will have agents to do things for them. The agents will incidentally collect real-time data on everything everyone uses them for. Presto! Tons of new training data. You are the product.
It seems to me you could generate a lot of fresh information by running every YouTube video, every hour of TV on archive.org, and every movie on The Pirate Bay through scene-by-scene image captioning plus high-quality Whisper transcription (not whatever junk auto-transcription YouTube has applied), and using that to produce screenplays of everything anyone has ever seen.
I'm not sure why I've never heard of this being done; it would be a good use of GPUs in between training runs.
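A minimal sketch of one worker in that pipeline, under some loud assumptions: off-the-shelf openai-whisper for speech, a BLIP captioning pipeline from Hugging Face for frames, local video files, and a crude fixed-interval "scene" heuristic instead of real shot detection. It's meant to show the shape of the idea, not a production data pipeline.

```python
import cv2                      # pip install opencv-python
import whisper                  # pip install openai-whisper
from PIL import Image
from transformers import pipeline

def video_to_screenplay(path, seconds_per_scene=10):
    # Speech: Whisper gives timestamped segments (far better than auto-captions).
    segments = whisper.load_model("medium").transcribe(path)["segments"]

    # Vision: caption one frame per fixed-length "scene".
    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large")
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25
    captions, frame_idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % int(fps * seconds_per_scene) == 0:
            img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            text = captioner(img)[0]["generated_text"]
            captions.append((frame_idx / fps, text))
        frame_idx += 1
    cap.release()

    # Interleave scene descriptions and dialogue into a rough "screenplay".
    events = [(t, f"[SCENE] {text}") for t, text in captions]
    events += [(s["start"], f'DIALOGUE: {s["text"].strip()}') for s in segments]
    return "\n".join(line for _, line in sorted(events))

print(video_to_screenplay("some_video.mp4"))
```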