Not surprising that the hyperscalers will make this decision for inference and maybe even a large chunk of training. I wonder if it will spur NVIDIA to work on an inference-only accelerator.
I'm not very well versed, but I believe that training requires more memory than inference, because the intermediate computations have to be stored so that you can calculate gradients for each layer.
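To put rough numbers on that, here is a back-of-envelope sketch, not a real memory profiler. Everything in it is an assumption for illustration: the function names, the hypothetical 7B-parameter model shape, 2-byte (FP16/BF16) weights, an Adam-style optimizer with two FP32 moments per parameter, and a crude `acts_per_layer` factor for how many hidden-sized tensors each layer keeps around for the backward pass. It also ignores the KV cache on the inference side.

```python
def inference_bytes(n_params, bytes_per_param=2):
    """Weights only. Ignores KV cache and transient activations."""
    return n_params * bytes_per_param


def training_bytes(n_params, n_layers, batch_tokens, hidden,
                   bytes_per_param=2, bytes_per_act=2, acts_per_layer=1):
    """Weights + gradients + optimizer state + stored activations.

    Assumes an Adam-style optimizer (two FP32 moments per parameter) and
    that each layer saves `acts_per_layer` tensors of shape
    (batch_tokens, hidden) for the backward pass. Real frameworks save
    more than one tensor per layer, so treat that factor as a lower bound.
    """
    weights = n_params * bytes_per_param
    grads = n_params * bytes_per_param
    optimizer = n_params * 4 * 2          # two FP32 moments per parameter
    activations = (n_layers * batch_tokens * hidden
                   * acts_per_layer * bytes_per_act)
    return weights + grads + optimizer + activations


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    n_params, n_layers, hidden, batch_tokens = 7_000_000_000, 32, 4096, 8192
    inf = inference_bytes(n_params)
    trn = training_bytes(n_params, n_layers, batch_tokens, hidden)
    print(f"inference: {inf / 1e9:.1f} GB")
    print(f"training:  {trn / 1e9:.1f} GB")
    print(f"ratio:     {trn / inf:.1f}x")
```

Under these assumptions the stored activations grow with batch size and depth, while the gradient and optimizer terms grow with parameter count, so the gap between training and inference depends heavily on the setup.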
They're already optimizing GPU die area for LLM inference over other pursuits: the FP64 units in the latest Blackwell GPUs were greatly reduced and FP4 was added.
edude03|4 months ago
Arguably that's a GPU? Other than (currently) exotic ways to run LLMs, like photonics or giant SRAM tiles, there isn't a device that's better at inference than GPUs, and they have the benefit that they can be used for training as well. You need the same amount of memory and the same ability to do math as fast as possible whether it's inference or training.
CharlesW|4 months ago
Yes, and to @quadrature's point, NVIDIA is creating GPUs explicitly focused on inference, like the Rubin CPX: https://www.tomshardware.com/pc-components/gpus/nvidias-new-...
"…the company announced its approach to solving that problem with its Rubin CPX— Content Phase aXcelerator — that will sit next to Rubin GPUs and Vera CPUs to accelerate specific workloads."
imtringued|4 months ago
And no, the NPU isn't a GPU.
AzN1337c0d3r|4 months ago
Similarly, Tenstorrent seems to be building something that you could consider "better", at least insofar as the goal is to be open.
nsteel|4 months ago
https://www.etched.com/announcing-etched
quadrature|4 months ago
conradev|4 months ago