top | item 46032759

blurrybird | 3 months ago

AWS and Anthropic did this back in July: https://aws.amazon.com/blogs/containers/amazon-eks-enables-u...

cowsandmilk|3 months ago

That is 100k nodes vs. 130k for Google’s new announcement. I can’t say whether the additional 30k presented new challenges, though.

Cthulhu_|3 months ago

I want to believe that this is an order-of-magnitude kind of problem, that is, if 100K is fine then 500K is also fine.

I only skimmed the article, but I'm confident that it's more a physical hardware, time, space, and electricity problem than a software / orchestration one; the article mentions that a cluster that size already needs to be multi-datacenter, given the sheer power requirements (2700 watts for one GPU in a single node).