Not knowing much about special-purpose chips, I would like to understand whether chips like this would give Google a significant cost advantage over the likes of Anthropic or OpenAI when offering LLM services. Is similar technology available to Google's competitors?
heymijo|10 months ago
Why?
For each new word a transformer generates, it has to move the entire set of model weights from memory to the compute units. For a 70-billion-parameter model with 16-bit weights, that means moving approximately 140 gigabytes of data to generate a single word.
GPUs have off-chip memory, so a GPU has to push data across a chip-to-memory bridge for every single word it generates. This architectural choice is an advantage in graphics processing, where large amounts of data need to be stored but not necessarily accessed as rapidly for every single computation. It's a liability in inference, where quick and frequent data access is critical.
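To make that concrete, here's a rough sketch of the memory-bandwidth ceiling on single-stream generation speed. The bandwidth number is an assumption for illustration (roughly an H100-class HBM part); the point is only that tokens/sec is capped by how fast the weights can be streamed from memory:

```python
# Back-of-envelope: memory-bandwidth-bound decoding speed.
# Assumed numbers for illustration: 70B parameters in 16-bit weights,
# ~3.35 TB/s of HBM bandwidth (roughly an H100-class GPU).

params = 70e9
bytes_per_weight = 2                      # fp16/bf16
model_bytes = params * bytes_per_weight   # 140 GB moved per generated token

hbm_bandwidth = 3.35e12                   # bytes/sec, assumed

# Every decoded token streams the full weight set from memory,
# so the ceiling on single-stream generation speed is:
tokens_per_sec = hbm_bandwidth / model_bytes

print(f"model weights: {model_bytes / 1e9:.0f} GB")
print(f"bandwidth-bound ceiling: {tokens_per_sec:.1f} tokens/sec")
```

Batching amortizes the weight traffic across many users, which is why serving throughput can be much higher than this single-stream figure.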
Listening to Andrew Feldman of Cerebras [0] is what helped me grok the differences. Caveat, he is a founder/CEO of a company that sells hardware for AI inference, so the guy is talking his book.
[0] https://www.youtube.com/watch?v=MW9vwF7TUI8&list=PLnJFlI3aIN...
latchkey|10 months ago
I wish I could say more about what AMD is doing in this space, but keep an eye on their MI4xx line.
hanska|10 months ago
https://www.youtube.com/watch?v=xBMRL_7msjY
pkaye|10 months ago
https://www.datacenterknowledge.com/data-center-chips/ai-sta...
https://www.semafor.com/article/12/03/2024/amazon-announces-...
claytonjy|10 months ago
What even is an AI data center? Are the GPU/TPU boxes in a different building than the others?
cavisne|10 months ago
No one else has access to anything similar; Amazon is just starting to scale its Trainium chips.
monocasa|10 months ago
The end of Moore's law pretty much dictates specialization, it's just more apparent in fields without as much ossification first.