andrekorol | 6 years ago

The game's GitHub page[1] states that you need a "beefy" GPU (~12 GB) and CUDA to play the game locally.

I think that's why the author was serving the game through Colab, since the majority of users probably don't have a 12 GB GPU.

[1] https://github.com/AIDungeon/AIDungeon/

klingonopera|6 years ago

Ooof, yes, now I see where the $10K/day is coming from...

As I have demonstrated, I don't have much of a clue when it comes to AI, but do users really need 12 GB of GPU RAM 100% of the time? Maybe it's possible to share one GPU among multiple users?

dodobirdlord|6 years ago

Google Colab gives each user a dedicated Nvidia Tesla K80 GPU for 12 hours for free, which is super cool and presumably why the project is on Colab. But as each user spins up their own Colab instance, it pulls down the 6 GB of GPT-2 model weights, incurring 30-40 cents of data egress charges against the GCP Storage Bucket that the data is stored in.

60k users yesterday * 6GB each -> 360TB of data egress!

Normally, a scenario like this wouldn't involve bandwidth costs because GCP -> GCP same-region bandwidth is free, but Colab is technically not part of GCP, so the bandwidth charge is being assessed as egress to the public internet, which is pricy for that much data. Though it's probably still a lot cheaper than paying for the GPU-hours for that many users.
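
The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch using the figures quoted in this comment (60k users, 6 GB per download, 30-40 cents per download); the function and constant names are made up for illustration, and 1 TB is assumed to be 1000 GB.

```python
# Rough egress estimate from the figures quoted in the comment above.
# All names here are hypothetical; the inputs are the commenter's numbers.

USERS_PER_DAY = 60_000               # "60k users yesterday"
MODEL_SIZE_GB = 6                    # GPT-2 weights pulled per Colab instance
COST_PER_DOWNLOAD_CENTS = (30, 40)   # quoted egress charge per download


def total_egress_tb(users, size_gb):
    """Total data transferred, in TB (assuming 1 TB = 1000 GB)."""
    return users * size_gb / 1000


def daily_cost_range_usd(users, cents_range):
    """Low and high daily egress cost in USD for a per-download cost range."""
    low_cents, high_cents = cents_range
    return users * low_cents / 100, users * high_cents / 100


if __name__ == "__main__":
    tb = total_egress_tb(USERS_PER_DAY, MODEL_SIZE_GB)
    low, high = daily_cost_range_usd(USERS_PER_DAY, COST_PER_DOWNLOAD_CENTS)
    print(f"Egress: {tb:.0f} TB/day")
    print(f"Cost:   ${low:,.0f} to ${high:,.0f} per day")
```

Taken at face value, these inputs give 360 TB and a daily cost in the tens of thousands of dollars, so the per-download and user-count figures are best read as rough estimates.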

andrekorol|6 years ago

The $10K/day was actually coming from the large egress fees they were getting for transferring the models and agents from Google Cloud Storage to the Colab notebooks. I think if you were to serve the game as a web app you definitely wouldn't need one instance of a 12 GB GPU for each user. But the thing about Colab is that you need a Google account to use it, and you run your own notebook, independent from the author's account.

sanxiyn|6 years ago

$10k/day is just for file transfer, not for GPU.

nootropicat|6 years ago

I played it on a CPU. I downloaded it to remove the profanity filter. It takes several seconds for each answer, so it's not ideal, but workable.