trsohmers | 1 year ago

Do you think the 16k GPUs get used once and then thrown away? Llama 405B was trained over 56 days on the 16k GPUs; if I round that up to 60 days and assume the current mainstream rate of $2 per H100 per hour from the Neoclouds (which are obviously making a margin), that comes out to a total cost of ~$47M. Meta is training a lot of models on its GPU equipment and would expect it to be in service for at least 3 years, and its cost is certainly lower than public cloud pricing.
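
For anyone who wants to check the arithmetic, here is a rough sketch of the estimate in Python. The function name and the two GPU counts are assumptions for illustration: the comment only says "16k", so both 16,000 and 16,384 are shown, and the ~$47M figure matches the latter. The 3-year service life is the lower bound stated in the comment, not a known Meta number.

```python
HOURS_PER_DAY = 24


def rental_cost_usd(gpus: int, days: int, usd_per_gpu_hour: float) -> float:
    """Cost of renting `gpus` GPUs around the clock for `days` days."""
    return gpus * days * HOURS_PER_DAY * usd_per_gpu_hour


# Assumption: "16k" could mean 16,000 or 16,384 GPUs; try both.
cost_16000 = rental_cost_usd(16_000, days=60, usd_per_gpu_hour=2.0)
cost_16384 = rental_cost_usd(16_384, days=60, usd_per_gpu_hour=2.0)

# Share of an assumed 3-year service life taken up by one 60-day run.
service_life_days = 3 * 365
run_share = 60 / service_life_days

print(f"16,000 GPUs: ${cost_16000 / 1e6:.1f}M")
print(f"16,384 GPUs: ${cost_16384 / 1e6:.1f}M")
print(f"Share of 3-year service life: {run_share:.1%}")
```

At the public rental rate, the run lands between roughly $46M and $47M, and it occupies only about 5.5% of a 3-year service life, which is the point about the hardware not being thrown away afterwards.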

lossolo|1 year ago

And Meta is using a lot of GPUs for offline ML and online ML features on Instagram, FB, etc., so nothing is "wasted".