top | item 38405462

jeffreyames | 2 years ago

10k H100 chips is considered a very large cluster. The third-fastest supercomputer in the world is Microsoft's Eagle, with 14k H100s: https://www.top500.org/lists/top500/2023/11/

chollida1|2 years ago

Ah, gotcha, so it's the fact that it's 10,000 chips in one dedicated cluster that makes it large, as opposed to Azure, which has an order of magnitude more GPUs but rents many of those out.

jeffreyames|2 years ago

High performance on a single task requires simultaneous computation and communication between nodes. If there's high latency between nodes, such as between nodes in different data centers, the communication costs can't be masked by computation.
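
A rough way to see this is a toy model of one training step, where communication can hide behind computation only up to the length of the compute phase. Everything here is an assumption for illustration: the function names, the message count, and the latency and bandwidth figures are made up, not measurements of any real cluster.

```python
def comm_time(n_messages, message_bytes, latency_s, bandwidth_bytes_per_s):
    """Total communication time: each message pays latency plus transfer time."""
    return n_messages * (latency_s + message_bytes / bandwidth_bytes_per_s)


def exposed_comm(compute_s, comm_s):
    """Communication time that cannot be hidden behind computation."""
    return max(0.0, comm_s - compute_s)


# Hypothetical workload: 10 ms of compute per step, 100 messages of 1 MB each.
COMPUTE_S = 10e-3
N_MESSAGES = 100
MESSAGE_BYTES = 1e6

# Hypothetical links (illustrative values only).
in_cluster = comm_time(N_MESSAGES, MESSAGE_BYTES, latency_s=5e-6,
                       bandwidth_bytes_per_s=50e9)
cross_dc = comm_time(N_MESSAGES, MESSAGE_BYTES, latency_s=50e-3,
                     bandwidth_bytes_per_s=1e9)

for name, comm_s in [("in-cluster", in_cluster), ("cross-datacenter", cross_dc)]:
    stall = exposed_comm(COMPUTE_S, comm_s)
    print(f"{name}: comm={comm_s:.4f}s, exposed={stall:.4f}s")
```

With these made-up numbers the in-cluster traffic fits entirely inside the compute phase, while the cross-datacenter traffic is dominated by per-message latency and leaves the GPUs waiting for most of the step.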

rightbyte|2 years ago

I guess Azure's are spread out too. Latency is higher between datacentres spread around the world.