top | item 42137757

(no title)

esdf | 1 year ago

Hosting large datasets can be expensive, but hosting the Danbooru datasets was not. They're "only" a few terabytes in size. A previous release was 3.4TB, so the latest is probably a few hundred GB to about 1TB larger. The download was hosted on a Hetzner IP, and Hetzner is a provider known for cheap servers. You can pay them $50/month for a server with "unmetered" 1 Gbit/s up/down networking and 16TB of disk. $600 a year would not be difficult.
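To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this comment ($50/month, 16TB of disk, a 1 Gbit/s link, a 3.4TB release). The function names are made up for illustration, decimal units are assumed (1TB = 10^12 bytes), and the transfer time assumes the link is fully saturated, which real downloads rarely manage.

```python
def annual_cost(monthly_usd: float) -> float:
    """Yearly cost of a flat monthly server rental."""
    return monthly_usd * 12


def cost_per_tb_month(monthly_usd: float, disk_tb: float) -> float:
    """Monthly cost per terabyte of raw disk on the server."""
    return monthly_usd / disk_tb


def transfer_hours(size_tb: float, link_gbit_per_s: float) -> float:
    """Hours to push size_tb through the link at full line rate.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Gbit = 1e9 bits)
    and ignores protocol overhead and contention.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    print(f"annual cost:      ${annual_cost(50):.0f}")
    print(f"per TB per month: ${cost_per_tb_month(50, 16):.2f}")
    print(f"one full copy:    {transfer_hours(3.4, 1):.1f} hours at line rate")
```

So under these assumptions one complete copy of the 3.4TB release takes on the order of hours of line-rate upload, which is why a single cheap "unmetered" box can plausibly serve a dataset of this size.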

gwern | 1 year ago

I think people tend to wildly overestimate how expensive hosting a large dataset has to be, because of cloud.

If you only need a few terabytes, you can rent a server from Hetzner for under $20/month, maybe under $10 at this point if you are patient. And since it's a fixed cost, you can use the server for other things too, like your website or encrypted backups or other files you host. I spent a while figuring this out before I decided to do Danbooru20xx, to make sure it was very cheap. Given how my website has grown in size, and how absurdly exorbitant cloud bandwidth and object/hard-drive storage are, the Danbooru dataset was practically free at the margin!

> Also, I thought these lesswrong folks were all about "effective altruism" and "earning to give" and that stuff.

Some people are, some people aren't. I'm not. (I have intellectual sympathy with EA, but not emotional.)

itake | 1 year ago

$600 a year is a lot of money...