edude03|5 months ago
Arguably that's a GPU? Other than (currently) exotic ways to run LLMs like photonics or giant SRAM tiles, there isn't a device that's better at inference than GPUs, and they have the benefit that they can be used for training as well. You need the same amount of memory and the same ability to do math as fast as possible whether it's inference or training.
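The memory point above can be sketched with a back-of-envelope calculation. This is my illustration, not from the thread: assuming fp16/bf16 weights at 2 bytes per parameter, the weights alone dominate the footprint and must be resident whether you're serving or training.

```python
# Rough GPU memory estimate for holding a model's weights.
# Assumption (mine, not the commenter's): fp16/bf16 weights at 2 bytes/param.

def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory just to hold the model weights, in decimal GB."""
    return n_params * bytes_per_param / 1e9

# A hypothetical 70B-parameter model needs ~140 GB for weights alone,
# resident on-device for inference and for training alike.
print(weight_memory_gb(70e9))  # 140.0
```

Training adds optimizer state and activations on top, and inference adds KV cache, but the baseline "hold the whole model and do math fast" requirement is shared — which is the commenter's point.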
CharlesW|5 months ago
Yes, and to @quadrature's point, NVIDIA is creating GPUs explicitly focused on inference, like the Rubin CPX: https://www.tomshardware.com/pc-components/gpus/nvidias-new-...
"…the company announced its approach to solving that problem with its Rubin CPX — Content Phase aXcelerator — that will sit next to Rubin GPUs and Vera CPUs to accelerate specific workloads."
edude03|5 months ago
In fact - I'd say we're looking at this backwards. GPUs used to be the thing that did math fast and put the result into a buffer where something else could draw it to a screen. Now a "GPU" is still a thing that does math fast, but sometimes you don't include the hardware to put the pixels on a screen.
So maybe - CPX is "just" a GPU but with more generic naming that aligns with its use cases.
imtringued|5 months ago
And no, the NPU isn't a GPU.
AzN1337c0d3r|5 months ago
Similarly, Tenstorrent seems to be building something you could consider "better", at least insofar as the goal is to be open.
nsteel|5 months ago
https://www.etched.com/announcing-etched