idkdotcom | 1 year ago

These seem like classic challenges of running distributed-systems workloads that are not specific to training LLMs.

Any of the supercomputers listed here https://en.wikipedia.org/wiki/TOP500 suffers from the same issues.

Think about it: while the national labs use these systems to model serious stuff, such as climate or nuclear weapons, Meta uses them to train LLMs. What a joke, honestly!

mhandley | 1 year ago

On the other hand, Meta just rapidly built two different training networks in existing datacenter buildings, with existing cooling constraints, using mostly commodity components (albeit expensive commodity components), each of which would place at #3 on that TOP500 list in terms of GPU power. Compare that with how long it took to get any of the other supercomputers from design to being fully commissioned.

_zoltan_ | 1 year ago

For-profit work is no less serious than what research labs do. I'd even say it's more important: it drives the economy.

whiplash451 | 1 year ago

A lot of serious things look like a toy or a joke at first.