top | item 35391976

(no title)

sacred_numbers | 2 years ago

If you bought an 8xA100 machine for $140k you would have to run it continuously for over 10,000 hours (about 14 months) to train the 7B model. By that time the value of the A100s you bought would have depreciated substantially; especially because cloud companies will be renting/selling A100s at a discount as they bring H100s online. It might still be worth it, but it's not a home run.

discuss

order

inciampati|2 years ago

If 8-bit training methods take off, I think the calculus is going to change rapidly, with newer cards that have decent amounts of memory and 8-bit acceleration starting to become dramatically more cost and time effective than the venerable A100s.