
minimaltom | 1 month ago

Fascinating! So each GPU is partnered with disk and NICs such that there's no oversubscription for bandwidth within its 'slice'? (idk what the right word is) And each of these 8 slices wires up over NVLink back to the host?

Feels like there's some amount of (software) orchestration for making data sit on the right drives or traverse the right NICs; guess I never really thought about the complexity at this kind of scale.

I googled GB200, it's cool that Nvidia sells you a whole unit rather than expecting you to DIY a PC yourself.


_zoltan_ | 1 month ago

usually it's 2-2-2 (2 GPUs, 2 NICs and 2 NVMe drives on a PCIe complex). no NVLink here, this is just PCIe - under a given PCIe switch chip you get full bandwidth, above it bandwidth is usually limited. so for example going GPU-to-GPU over PCIe will walk:

GPU -> PCIe switch -> PCIe switch (most likely the CPU, with limited bw) -> PCIe switch -> GPU

NVLink comes into the picture as a separate, 2nd link between the GPUs: if you need to do GPU-to-GPU, you can use NVLink.
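if you want to see what the paths look like on a given box, nvidia-smi topo -m prints the connectivity matrix, and you can also ask the CUDA runtime directly. rough sketch below (the file name is made up, and the "performance rank" attribute is only a relative hint about link quality - check the CUDA docs for how to interpret it on your system):

    // probe which GPU pairs can do direct peer access, and how the runtime
    // ranks the path between them (NVLink vs. hopping through the CPU's
    // PCIe root complex). compile with something like: nvcc p2p_probe.cu
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int n = 0;
        cudaGetDeviceCount(&n);
        for (int a = 0; a < n; ++a) {
            for (int b = 0; b < n; ++b) {
                if (a == b) continue;
                int can = 0, rank = 0;
                // can this pair talk directly, without staging through host memory?
                cudaDeviceCanAccessPeer(&can, a, b);
                // relative quality of the link between the two devices
                cudaDeviceGetP2PAttribute(&rank, cudaDevP2PAttrPerformanceRank, a, b);
                printf("GPU %d -> GPU %d: peer access %s, perf rank %d\n",
                       a, b, can ? "yes" : "no", rank);
            }
        }
        return 0;
    }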

you never needed to DIY your stuff, at least not for the last 10 years: most hardware vendors (Supermicro, Dell, ...) will sell you a complete system with 8 GPUs.

what's nice on GH200/GBx00/VR systems is that you can use chip-to-chip NVLink between the CPU and GPU, so the CPU can access GPU memory coherently and vice versa.
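a quick sketch of what that coherency buys you in practice, assuming a GH200-style box where the GPU can touch plain system-allocated memory directly (on an ordinary PCIe system you'd have to use cudaMallocManaged or explicit copies instead):

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    __global__ void increment(int *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1;   // GPU writes straight into CPU-allocated memory
    }

    int main() {
        const int n = 1 << 20;
        // plain host malloc: no cudaMalloc, no explicit cudaMemcpy
        int *data = (int *)malloc(n * sizeof(int));
        for (int i = 0; i < n; ++i) data[i] = i;

        increment<<<(n + 255) / 256, 256>>>(data, n);
        cudaDeviceSynchronize();

        // CPU reads the result back coherently over the C2C link
        printf("data[42] = %d (expected 43)\n", data[42]);
        free(data);
        return 0;
    }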