Indeed, you can strip a whole host of things out of the GPU (the framebuffer, the Z-buffer, the transform and lighting engine) and fill the freed-up die area with more CUDA cores, a higher-bandwidth memory controller with a wider bus, and so on.
Rinzler89|1 year ago
It's already there. Have you seen the six-figure AI chips that Nvidia is selling to data center customers? Those chips aren't GPUs; they can't draw a single triangle or map a single texture. They're AI accelerators all the way. Do people still think Nvidia is selling gaming GPUs for AI workloads like it's 2018?
Google, Meta, et al. are working on their own AI chips, but those chips will have to beat Nvidia's on performance and TCO, and Nvidia shows no signs of slowing down to let competitors catch up.
The chips are optimised for matmuls, but not for the transformer architecture per se. With dedicated ASICs and the weights hardcoded (or stored in SRAM), we could theoretically get one token per cycle, so millions or billions of tokens per second, not hundreds.
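As a back-of-envelope sketch of the one-token-per-cycle claim (the clock speed here is an assumed, illustrative figure, not from the comment):

```python
# Rough throughput math for a hypothetical ASIC that emits one
# token per clock cycle, with weights hardcoded or held in SRAM
# so there are no memory stalls.
clock_hz = 1_000_000_000   # assume a 1 GHz clock (illustrative)
tokens_per_cycle = 1
tokens_per_second = clock_hz * tokens_per_cycle
print(f"{tokens_per_second:,} tokens/s")  # 1,000,000,000 tokens/s
```

Even at a modest clock speed this lands in the billions-per-second range the comment describes, far above the hundreds of tokens per second typical of GPU inference.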
kolinko|1 year ago
Etched, for example, claims to have a chip in the works reaching 500k tok/s, which is still far from the theoretical max with current technology.
A similar scenario played out with Bitcoin's GPU/FPGA/ASIC progression: the current ASICs are millions of times faster than GPUs.
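For a sense of scale on the Bitcoin analogy, here is a rough ratio using ballpark public figures (both hash rates are assumptions for illustration, not from the comment):

```python
# Compare a circa-2010 mining GPU against a modern SHA-256 mining ASIC.
gpu_hashrate = 200e6     # ~200 MH/s for an early mining GPU (assumed)
asic_hashrate = 200e12   # ~200 TH/s for a modern mining ASIC (assumed)
speedup = asic_hashrate / gpu_hashrate
print(f"ASIC is ~{speedup:,.0f}x faster")  # ~1,000,000x
```

Per device, that is roughly the millions-fold gap the comment points to, before even accounting for the ASIC's far better hashes-per-watt.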
matwood|1 year ago
TCO, yes. Raw performance, not necessarily. TCO will attack NVDA's margins. When Meta last wrote about their cluster, it was presented as power-equivalent to X NVDA chips. They are already bringing their own chips into the mix.
posix_compliant|1 year ago
With Bitcoin I feel like it's different, since the hashing algorithm would only ever change during a fork, and that only happens once every few years.
With AI, we're constantly training different models, which can't be trained using ASICs. If we ever get to the point where we no longer need to train new models, then yeah, it will go the way of Bitcoin.
TacticalCoder|1 year ago
> With Bitcoin I feel like it's different, since the hashing algorithm would only ever change during a fork, and that only happens once every few years.
Wait, what!? Did the Bitcoin hashing algorithm ever change?
kkielhofner|1 year ago
The problem with this comparison is that Bitcoin has basically just been SHA-256 for 15 years and likely will continue to be for some time.
Transformers have been mostly dominant for at least several years, but there are still other archs (CNN, RNN, etc.) in various use cases, and we're already seeing nearly-fundamental changes in Transformers and "emerging" approaches like Mamba, RWKV, hybrids, etc. Transformers have shown remarkable versatility and adaptability (that's their whole thing), but the architecture is already creaking and showing its age.
Startups building Transformer-specific silicon are playing a very risky game that is already somewhat problematic now and almost certainly won't end well.
AI is much newer, much more vast, and moving much more quickly. The ASIC cycle of design, tape-out, manufacture, software ecosystem, and actually getting to market is fundamentally too long, and I suspect even the Transformer-specific silicon we see now will be viewed as a major blunder in the relatively near future:
"Oh yeah, remember those graveyard companies that did transformer silicon back in the first AI hype round?"
I cannot see how anything other than GPGPU, TPU, NPU, etc. (or similar "generic" approaches) will have legs.
glitchc|1 year ago
And, as it happens, that's exactly what NVidia's done with the H100: https://developer.nvidia.com/blog/nvidia-hopper-architecture...
It still needs to be programmable though. Can't get away from that.