top | item 39378889

jonathanlei | 2 years ago

Jonathan from TensorDock (https://tensordock.com/) here - we listed two of our A100 and H100 clusters on the site.

The IB on our clusters (can't speak to others) is 8x 400 Gbps. Most customers training foundation models are able to fully utilize that fabric in parallel.

stonogo|2 years ago

Which HCAs are enabling that? Are you using eight 4-link QSFPs here, presuming this is NDR?

And out of curiosity, is aggregate bandwidth the normal marketing metric in this industry? In my neck of the woods this would be reported as an NDR400 system.
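
For anyone following along, here is a rough sketch of the arithmetic behind the two ways of quoting the same fabric. It assumes NDR means 4 lanes at 100 Gb/s per port, and that "8x 400 Gbps" means eight such ports per node (the per-node reading is my assumption, not something stated above). The function and constant names are made up for illustration.

```python
# Assumption: an NDR InfiniBand port is 4 lanes at 100 Gb/s each.
LANES_PER_PORT = 4
GBPS_PER_LANE = 100
# Assumption: "8x 400 Gbps" means eight such ports per node.
PORTS_PER_NODE = 8


def port_gbps(lanes: int = LANES_PER_PORT, lane_gbps: int = GBPS_PER_LANE) -> int:
    """Per-port rate, the figure behind a label like 'NDR400'."""
    return lanes * lane_gbps


def aggregate_gbps(ports: int, per_port: int) -> int:
    """Sum over all ports, the 'aggregate bandwidth' style of figure."""
    return ports * per_port


per_port = port_gbps()
total = aggregate_gbps(PORTS_PER_NODE, per_port)

print(f"per port:  {per_port} Gb/s")
print(f"aggregate: {total} Gb/s ({total / 1000} Tb/s, {total / 8} GB/s)")
```

Both numbers describe the same hardware under these assumptions; the per-port figure says what a single link can carry, while the aggregate only holds if traffic is spread across all eight ports at once.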