Chamix | 2 years ago
Cost aside, HBM dies themselves aren't getting significantly denser anytime soon, and with current manufacturing methods there simply isn't enough package space to fit significantly more dies on the GPU.
So I suspect the major hardware jumps will continue to come from NVLink/NVSwitch. NVLink 4 + NVSwitch 3 already allows for up to 256 GPUs https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-ho... ; more links will let ever larger numbers of GPUs pool with sufficient bandwidth for inference on larger models.
As already mentioned, see this HN post about the GH200 https://news.ycombinator.com/item?id=36133226, which has some further discussion about the cutting edge of bandwidth for Nvidia DGX and Google TPU pods.
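To put the pooling point in rough numbers, here is a back-of-the-envelope sketch. Only the 256-GPU domain size comes from the comment above; the 80 GB of HBM per GPU, the 1T-parameter model, the 16-bit weights, and the 80% usable fraction are illustrative assumptions, not figures from the linked posts, and the function names are made up for this example. It counts weights only and ignores KV cache and activations.

```python
import math


def gpus_needed(params_billion, bytes_per_param, hbm_gb_per_gpu, usable_fraction=0.8):
    """Minimum number of GPUs whose pooled HBM can hold the model weights alone.

    All arguments are hypothetical inputs; usable_fraction is an assumed
    allowance for memory that is not available for weights.
    """
    # 1e9 params * bytes per param / 1e9 bytes per GB = GB of weights
    weight_gb = params_billion * bytes_per_param
    usable_gb_per_gpu = hbm_gb_per_gpu * usable_fraction
    return math.ceil(weight_gb / usable_gb_per_gpu)


def pooled_hbm_tb(num_gpus, hbm_gb_per_gpu):
    """Total HBM across an NVLink domain, in TB (decimal)."""
    return num_gpus * hbm_gb_per_gpu / 1000


if __name__ == "__main__":
    # Assumed: 1T parameters at 2 bytes each, 80 GB HBM per GPU.
    print("GPUs needed for weights:", gpus_needed(1000, 2, 80))
    # 256 GPUs is the domain size mentioned above; 80 GB per GPU is assumed.
    print("Pooled HBM in TB:", pooled_hbm_tb(256, 80))
```

The point is only that per-GPU capacity sets a floor on GPU count, so if HBM per package stays roughly flat, larger models need larger NVLink domains rather than bigger GPUs.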
hesdeadjim | 2 years ago