summerlight|10 months ago
Lots of other factors. I suspect this is one of the reasons why Google cannot offer TPU hardware outside of its cloud service. A significant chunk of TPU efficiency can be attributed to external factors which customers cannot easily replicate.
xnx|10 months ago
Google does many pieces of the data center better. Google TPUs use 3D torus networking and are liquid cooled.
> What even is an AI data center?
Being newer, AI installations have more variation/innovation than traditional data centers. Google's competitors have not yet adopted all of Google's advances.
> are the GPU/TPU boxes in a different building than the others?
Not that I've read. They are definitely bringing new data centers online, but I don't know if they are designed from the start for pure-AI workloads.
nsteel|10 months ago
Wouldn't a 3D torus network have horrible performance with 9,216 nodes? And really horrible latency? I'd have assumed a traditional spine-leaf topology would do better. But I must be wrong, as they're claiming their latency is great here. Of course, they provide zero actual evidence of that.
And I'll echo: what even is an AI data center? Because we're still none the wiser.
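A quick back-of-envelope helps here: in a torus, hop counts grow roughly with the cube root of node count, not linearly, so even at 9,216 nodes the worst-case path is only a few dozen hops. The sketch below computes worst-case and average hop counts for a 3D torus; the 9,216 figure comes from the thread, but the 16×24×24 dimension split is purely an illustrative assumption, not Google's actual topology.

```python
def ring_dist(a, b, k):
    """Shortest distance between nodes a and b on a ring of k nodes,
    taking the wraparound link into account."""
    d = abs(a - b)
    return min(d, k - d)

def torus_worst_and_avg(dims):
    """Worst-case and average hop count between two nodes of a torus
    whose per-dimension sizes are `dims`. Distances in a torus are the
    sum of independent per-dimension ring distances."""
    worst = sum(k // 2 for k in dims)
    # For even k, the average ring distance over all ordered node pairs
    # (including a node paired with itself) works out to exactly k/4,
    # and the per-dimension averages simply add.
    avg = sum(k / 4 for k in dims)
    return worst, avg

dims = (16, 24, 24)          # assumed split: 16 * 24 * 24 = 9,216 nodes
worst, avg = torus_worst_and_avg(dims)
print(worst, avg)            # 32 hops worst case, 16.0 on average
```

So the latency question turns on per-hop cost times a few dozen hops versus a spine-leaf fabric's fixed small hop count through much higher-radix (and typically more oversubscribed) switches; raw hop count alone doesn't settle it either way.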