I love Backblaze for their drive stats posts, but has anyone else run into multiple annoyances with their main services?
FWIW, I have been a mostly-happy customer for multiple years, and I can always seem to find the files I'm looking for in my backups.
However, I've run into the following issues on _multiple Windows PCs_:
- `bzfilelist.exe` constantly pegs my SSD at 100% disk utilization. I have to kill it any time I want to do anything remotely disk-intensive (such as running a game or editing video). Not sure if this is a workaround to avoid having to watch the filesystem, but it is an annoyance. Lowering the process priority helps in some cases, but usually I have to kill it outright to get acceptable performance in other applications. I'm also not sure this is good for the SSD's health.
- Certain folders are hardcoded to be excluded. Standard folders such as `C:/Windows` make sense, but it also hardcodes excluding _all of Program Files_. This means Steam game files are not automatically backed up (yes, Program Files is not an ideal spot for Steam to store data, but that's how it has worked for a long, long time). I had to hack together a PowerShell script that periodically copies my files to another drive so Backblaze will back them up. "Automatic backups", eh?
- Despite the above, it _does_ automatically back up Chrome's temporary files, which means a ton of churn as I browse throughout the day. I can manually exclude a path, but if they're going to be draconian about hardcoding paths to not back up, why not hardcode Chrome's app data too?
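The periodic copy job for the Steam exclusion can be a simple mirror. Here's a minimal sketch of the idea in Python (my actual script is PowerShell, and the paths below are purely illustrative, not my real setup):

```python
import shutil
from pathlib import Path

# Illustrative paths -- adjust to your own setup.
SRC = Path(r"C:/Program Files (x86)/Steam/userdata")
DST = Path(r"D:/SteamMirror/userdata")  # a drive Backblaze does back up

def mirror(src: Path, dst: Path) -> int:
    """Copy files that are new or changed since the last run; return the count copied."""
    copied = 0
    for f in src.rglob("*"):
        if not f.is_file():
            continue
        target = dst / f.relative_to(src)
        # Copy if the destination is missing or older than the source.
        if not target.exists() or target.stat().st_mtime < f.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)  # copy2 preserves timestamps
            copied += 1
    return copied
```

Schedule it with Task Scheduler and Backblaze picks the mirrored copies up on its own.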
In addition, I have played around with their B2 service, an "S3-compatible" offering, and run into issues there as well:
- The S3-compatible hosting is not at parity with S3. There is no `index.html` autodetection, so I had to hack together an nginx proxy that assumes an `index.html` file at each directory path.
- The S3-compatible web hosting is extremely slow. It's a common pattern to want to host a periodically-updated static site out of an S3 bucket, but I have found the speed to be insufficient for even basic HTML in B2.
- I was unable to get their provided tooling for backing up local files to B2 working on a Raspberry Pi. I ended up rsyncing to a different service.

Am I the only one with these problems?
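For the curious, the nginx workaround was along these lines. This is a hedged sketch only: the bucket name and the B2 download host here are placeholders, not my actual setup.

```nginx
# Map directory requests to their index.html before proxying to B2.
# "example-bucket" and the f000 download host are placeholders.
location / {
    rewrite ^(.*)/$ $1/index.html break;
    proxy_pass https://f000.backblazeb2.com/file/example-bucket$uri;
    proxy_set_header Host f000.backblazeb2.com;
}
```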
In my experience B2 cloud hosting is at least an order of magnitude slower for larger files and 2-5x slower for smaller files. If you need to do lots of lookups, forget about it. I mostly use it as a sort of archival system, storing data that doesn't need to get there quickly, which is a very narrow use case. Even for larger server backups it's way too slow. Also, they don't allow you to change regions or have region-specific buckets; you are locked in at account creation.
Their prices are so low that their real price is the dollar cost plus this slowness. So I use a mix of B2 and S3 depending on the use case.
I haven't tried hosting things in B2, but I've been backing up to B2 via Duplicati and it has yet to have a single hiccup, and the pricing is great. I have around 190 GB on there, updated every night, and I pay around $0.90/month - truly less than a cup of coffee.
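That figure lines up with B2's advertised storage price, which I believe was about $0.005 per GB-month at the time (an assumption, so check current pricing):

```python
# Assumes B2 storage at $0.005 per GB-month (check current pricing).
PRICE_PER_GB_MONTH = 0.005
stored_gb = 190
monthly_cost = stored_gb * PRICE_PER_GB_MONTH
print(f"${monthly_cost:.2f}/month")  # about $0.95/month
```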
HGST is still the best. But looking at this data doesn't actually give me confidence; it makes me even more paranoid about my data.
We still don't have cheap data storage that lasts a lifetime. My friend was showing me photos of her daughter from 10 years ago, and most of them were "backed up" on a DVD. I couldn't quite bring myself to explain DVD lifetimes to her.
I recall a conversation years ago about national-archive-type groups running into the problem that the amount of data divided by the transfer rate was either longer than the lifetime of the media, or longer than the refresh window.
In the latter case you go straight from floppy to DVD, skipping over CD. In the former case you are doing data-recovery operations on aging CD-R media. It's a mess.
And it makes me wonder about the relative speed of disk-duplication solutions versus backup solutions. That is to say, if I can produce three backups in roughly the same time as one, maybe I should be making mirrored data transfers, and then shipping one to <atmospherically stable location> and another to <politically stable location>.
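Back-of-the-envelope, the transfer-time problem looks like this (the fleet size and stream rate below are illustrative numbers, not from any article):

```python
def transfer_days(data_tb: float, rate_mb_s: float) -> float:
    """Days needed to copy `data_tb` terabytes at a sustained `rate_mb_s` MB/s."""
    seconds = data_tb * 1_000_000 / rate_mb_s  # 1 TB = 1,000,000 MB (decimal units)
    return seconds / 86_400  # seconds per day

# e.g. a petabyte archive over a single sustained 100 MB/s stream:
print(round(transfer_days(1000, 100)))  # roughly 116 days for one pass
```

At that rate a single refresh pass can easily outlast the window you planned for it, which is exactly the bind those archive groups described.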
Hm. I'm of a different opinion, based on decades of observation and experience. For one, the environment they seem to operate in is toxic, as seems to be the general case in data centers, not only their special pods. This has been impressively demonstrated by Brendan Gregg in "Shouting in the Datacenter": https://www.youtube.com/watch?v=tDacjrSCeq4
Besides that I'd guess the electricity there isn't that 'clean' either.
At small-scale/SOHO use, similar factors apply. If you pack drives in the densest possible way for your data-hoarding needs, in systems with shoddy power supply units, you are asking for trouble. Similar things may apply to shiny network-attached-storage gadgets from the common brands.
In addition, your body walking/stomping on flexible floors makes them shake, as do wobbly desks. So put them in BIG and well-ventilated cases with good power supplies, and put those where they don't shake. Or on dampers, like two tennis balls cut in half for four dampers. I know, it's inconvenient, but that's how it is.
I think hard disk drives, tapes and CDs have a pretty good offline life, and should last a lifetime if you, like me, are not planning on surviving to an advanced age.
1% annual failure rates don't seem that bad to me; historically, drive failures were closer to 5%. The 14TB Seagates with ~5% failure rates are explained as follows:
> The 14TB Seagate (ST14000NM0138) drives have an AFR of 5.55% for Q2 2021. These Seagate drives along with 14TB Toshiba drives (MG07ACA14TEY) were installed in Dell storage servers deployed in our U.S. West region about six months ago. We are actively working with Dell to determine the root cause of this elevated failure rate and expect to follow up on this topic in the next quarterly drive stats report.
It seems like a bathtub-curve issue: the Seagates have very few months of use but have failed in high numbers. It could be that Dell messed up quality control, or maybe it is Seagate.
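For context, Backblaze annualizes failures over total drive-days of service, which is why a young fleet with few absolute failures can still show a high AFR. A sketch of that calculation (the fleet numbers below are made up, chosen only to land near the reported rate):

```python
def annualized_failure_rate(failures: int, drive_days: float) -> float:
    """AFR as a percentage: failures per drive-year of service."""
    return failures / drive_days * 365 * 100

# A young fleet: 1,000 drives run ~180 days each, with 27 failures so far.
afr = annualized_failure_rate(27, 1000 * 180)
print(f"{afr:.1f}%")  # about 5.5%, in the ballpark of the reported 5.55%
```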
Seagate - to be avoided.