top | item 19487201

zoratu | 7 years ago

In that case, what do you use?

dmd | 7 years ago

We're using Bacula, because in our case it's the only thing that works at our volume and also works with tape (which is the only cost-effective way we've found to archive multi-petabyte datasets).

But I'm not very happy about it, because it's insanely overcomplicated for no good reason, and because it's file-based.

prirun | 7 years ago

Backblaze has done some cost comparisons between LTO and cloud storage:

https://www.backblaze.com/blog/lto-versus-cloud-storage/

If you're interested, I'm doing experiments with another site that has 500 TB to back up, where I added sampling and sharding to HashBackup (I'm the author).

Sampling allows you to do faster simulated backups to determine the best backup parameters to use. In his case, we determined that a very large block size (64 MB) was the best way to back up his data.
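HashBackup's internals aren't described here, so this is only a minimal sketch of the idea: take a random sample of files, then for each candidate block size count how many blocks a backup would have to store. All names and parameters (`sample_files`, `simulate_backup`, the `fraction` knob) are illustrative assumptions, not HashBackup options.

```python
import os
import random

def sample_files(root, fraction=0.01, seed=42):
    """Collect a deterministic random sample of files under root.

    fraction and seed are illustrative knobs, not real HashBackup options.
    """
    rng = random.Random(seed)
    sample = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            if rng.random() < fraction:
                sample.append(os.path.join(dirpath, name))
    return sample

def simulate_backup(paths, block_size):
    """Count how many fixed-size blocks a backup of paths would store.

    A real tool would also hash block contents to estimate dedup and
    compression ratios; here we only tally block counts, which is enough
    to compare per-block overhead across candidate block sizes.
    """
    blocks = 0
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # skip files that vanish or are unreadable
        blocks += max(1, (size + block_size - 1) // block_size)
    return blocks

# Compare a few candidate block sizes on the sample, e.g.:
# sample = sample_files("/data", fraction=0.01)
# for bs in (1 << 20, 4 << 20, 64 << 20):
#     print(bs, simulate_backup(sample, bs))
```

Because only a small sample is scanned, many parameter combinations can be tried in the time one real backup would take.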

Sharding automatically partitions the filesystem so multiple backups can run simultaneously, getting backup speeds in the 250-400 MB/s range.
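Again a hedged sketch rather than HashBackup's actual implementation: one simple way to shard is to hash each file path to a stable shard number, giving disjoint partitions that independent workers can back up in parallel. The helper names (`shard_of`, `partition`, `backup_shard`) are hypothetical.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

def shard_of(path, num_shards):
    """Map a path to a stable shard number by hashing the path string."""
    digest = hashlib.md5(path.encode("utf-8", "surrogateescape")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards

def partition(root, num_shards):
    """Split all files under root into num_shards disjoint lists."""
    shards = [[] for _ in range(num_shards)]
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            shards[shard_of(path, num_shards)].append(path)
    return shards

def backup_shard(paths):
    """Stand-in for one backup worker: just totals the bytes it would read."""
    total = 0
    for path in paths:
        try:
            total += os.path.getsize(path)
        except OSError:
            continue
    return total

def parallel_backup(root, num_shards=4):
    """Run one worker per shard concurrently and total the bytes covered."""
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        return sum(pool.map(backup_shard, partition(root, num_shards)))
```

Hashing the path (rather than, say, splitting by directory) keeps shards roughly balanced even when a few directories hold most of the files, which is what lets several workers stay busy and push aggregate throughput up.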

It's more at the proof-of-concept stage, but having another large site to work with would be fantastic! A couple of the larger sites using HashBackup are EURAC (European Research Center) and HMDC (Harvard-MIT Data Center).