stephenroller | 5 years ago

No, they aren't releasing the weights. They are releasing it as ML as a service. Right now it's in free beta, but it will open up for commercial usage in the future.

On another note:

At 175B parameters in float16, the in-memory footprint of the weights is about 350GB, and activations would take it to around 400GB. You would need 12 or 13 32GB V100 GPUs to hold it in memory, or three p3.8xlarge, meaning that loading it on AWS would cost around $35-40/hr.
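For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The 32GB per-GPU figure is an assumption (the 32GB V100 variant), and the 400GB total is just the rough estimate from the comment, not a measured number.

```python
import math

PARAMS = 175e9                   # parameter count quoted in the comment
BYTES_PER_PARAM = 2              # float16 is 2 bytes per parameter
TOTAL_WITH_ACTIVATIONS_GB = 400  # rough total from the comment, not measured
GPU_MEM_GB = 32                  # assumption: 32GB V100 variant


def weights_gb(params, bytes_per_param):
    """Size of the raw weights in GB (using 1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def gpus_needed(total_gb, gpu_mem_gb):
    """Smallest number of GPUs whose combined memory covers total_gb."""
    return math.ceil(total_gb / gpu_mem_gb)


print(weights_gb(PARAMS, BYTES_PER_PARAM))                 # 350.0
print(gpus_needed(TOTAL_WITH_ACTIVATIONS_GB, GPU_MEM_GB))  # 13
```

400 / 32 is 12.5, which is where "12 or 13" comes from. This ignores any per-GPU overhead such as framework buffers.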

Though if you didn't care about speed, you could load the weights from disk a few layers at a time and run the forward pass through them on a single GPU.
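A minimal sketch of that idea, on a toy scale and on the CPU. Everything here is hypothetical: the per-layer pickle files, the helper names, and the matrix-plus-ReLU "layer" are stand-ins for illustration, not the real model's architecture or checkpoint format. The point is only that peak memory is bounded by one chunk of layers rather than by the whole model.

```python
import os
import pickle
import random
import tempfile


def save_toy_layers(directory, n_layers, width, seed=0):
    """Write one small random weight matrix per file (stand-in for checkpoint shards)."""
    rng = random.Random(seed)
    paths = []
    for i in range(n_layers):
        w = [[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(width)]
        path = os.path.join(directory, f"layer_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(w, f)
        paths.append(path)
    return paths


def apply_layer(w, x):
    # Toy layer: matrix-vector product followed by ReLU.
    return [max(0.0, sum(wij * xj for wij, xj in zip(row, x))) for row in w]


def streamed_forward(paths, x, layers_per_chunk=2):
    """Forward pass that holds at most layers_per_chunk layers in memory at once."""
    for start in range(0, len(paths), layers_per_chunk):
        chunk = []
        for path in paths[start:start + layers_per_chunk]:
            with open(path, "rb") as f:
                chunk.append(pickle.load(f))
        for w in chunk:
            x = apply_layer(w, x)
        del chunk  # drop this chunk's weights before loading the next one
    return x


with tempfile.TemporaryDirectory() as d:
    paths = save_toy_layers(d, n_layers=6, width=4)
    out = streamed_forward(paths, [1.0, 2.0, 3.0, 4.0], layers_per_chunk=2)
    print(len(out))
```

The result does not depend on the chunk size, only the peak memory and the number of disk reads do. On a real model the disk and host-to-GPU transfers would dominate the runtime, which is the "didn't care about speed" part.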

freeqaz|5 years ago

$35-40 an hour is well within the range of a "that sounds fun to grab my friends and mess around with it for a few hours on the weekend" budget!

Especially if you can use spot instances or a cheaper cloud host.

But I guess without the weights, the floor for this is several thousand dollars to play around with.

Do you know if the data set is being released?