
Backblaze Hard Drive Stats Q3 2019

197 points| garaetjjte | 6 years ago |backblaze.com | reply

122 comments

[+] akersten|6 years ago|reply
These stats are wonderful and make me really appreciate the culture of the company. I've been considering becoming a customer because of these posts, since they reflect a lot of pride in the craft and care for the community.

But I'm stuck on one thing. Does Backblaze offer a solution for Linux backup? I've got an NFS server running that I use for home storage that I want to back up - but looks like Backblaze is only offering a Windows or Mac client.

Maybe the business version would work, since it claims to support NAS backup. But then the pricing seems lower than the personal edition (60$/computer/year = 5$/month < 6$/month) - unless that's implying that every computer that accesses the NAS is part of the fee?

So I guess: is there a reasonable Linux offering for home users from Backblaze? If not, what service do folks suggest?

[+] atYevP|6 years ago|reply
Yev here from Backblaze -> the Computer Backup service does not offer an unlimited Linux backup service. We do have support for Linux with Backblaze B2 Cloud Storage and our integration partners (https://www.backblaze.com/b2/integrations.html?platform=linu... - I filtered the list by Linux for you). On the business side the NAS backup is also done via our partnerships and B2. On the consumer end, we haven't found a way to make unlimited backup sustainable with NAS/Linux since those devices/platforms typically have WAY more data than the average user.
[+] seanlane|6 years ago|reply
I've been using restic with the Backblaze B2 backend for a home server backup, which seems to be as close as Backblaze will ever get to having a Linux client.

My rough numbers are 450GB stored monthly, 4GB downloaded monthly, 90k stored files, 385,000 individual transactions, which ends up costing about $2.25 in storage fees, $0.25 for transactions, and $0.10 for download bandwidth.
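To make the arithmetic behind those numbers explicit, here is a minimal sketch. The storage rate of $0.005 per GB per month is an assumption (it is not stated in the comment, but it reproduces the $2.25 figure); the transaction and download amounts are taken as reported rather than derived.

```python
# Assumed B2 storage rate in USD per GB per month (not stated in the thread).
ASSUMED_STORAGE_USD_PER_GB_MONTH = 0.005


def storage_cost(stored_gb: float, rate: float = ASSUMED_STORAGE_USD_PER_GB_MONTH) -> float:
    """Monthly storage fee in USD, rounded to cents."""
    return round(stored_gb * rate, 2)


def monthly_total(storage: float, transactions: float, download: float) -> float:
    """Sum of the three line items in USD, rounded to cents."""
    return round(storage + transactions + download, 2)


storage = storage_cost(450)                  # 450 GB stored
total = monthly_total(storage, 0.25, 0.10)   # fees as reported in the comment
print(f"storage: ${storage:.2f}, total: ${total:.2f}")
```

Under that assumed rate, storage dominates the bill, so the total scales roughly linearly with the amount of data kept.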

[+] icefo|6 years ago|reply
You can use Backblaze B2 with duplicacy. I back up my machines to my NAS and the NAS itself to B2 using duplicacy. The free version is command line only (with text config files). It works really well and B2 prices are reasonable! They have a billing estimator somewhere on their site.

I made a script to back up nested ZFS volumes to B2 using duplicacy, if anyone is interested: https://gist.github.com/icefo/07aab2789e5cfa71045343953aaf88... It makes a snapshot, backs that up, and gracefully handles unexpected network loss, power loss, or backups that span longer than the cron interval.

[+] bretpiatt|6 years ago|reply
If you're a Linux home user and you want a second copy of your data somewhere else in the event of a system failure / flood / fire I'd consider scripting something to B2 (Backblaze's object storage), AWS Glacier, or other archive priced cloud services.

If you're looking for a backup service to handle point in time recovery, differential deduplication, or other features of a backup service (vs. a second copy of your data archive) those also exist though I don't have a clear recommendation for home users.

On Backblaze's business pricing, my understanding is that they require a minimum of 5 users, so that's where the price difference is made up.

[+] mosselman|6 years ago|reply
I also use B2 even though I am on Mac. I bought Arq to back up to B2 and I will break even within the first year. I used to pay for the personal edition before realising that my new setup is far cheaper.

I have used restic and rclone with very crappy platforms (Google Drive and Microsoft OneDrive) for testing purposes and that worked fine enough so I can imagine it works even better on something like B2.

[+] freedomben|6 years ago|reply
I also run Linux, and I use rclone[1] with Backblaze B2 (cloud storage). It works really well. I have it set on a nightly cron job locally. I also trigger it manually if I have something immediate (like photos transferred via adb from android that I want to ensure get backed up).

[1]: https://rclone.org/b2/

[+] e12e|6 years ago|reply
Excellent answers here. As can be surmised, the answer is: for Linux, use B2.

But it just occurred to me, that for home use, if you're a special kind of masochist, you might expose your DAS to a vm running ReactOS and use the windows client?

I don't recommend it, and have no idea if it would work.. But would love to see a write-up if someone wanted to try it...

[+] techntoke|6 years ago|reply
No, but CrashPlan is $10 per month and has a Linux client that supports unlimited storage.
[+] Havoc|6 years ago|reply
>If not, what service do folks suggest?

O365 comes with 5 TB of space that can be addressed by a diff backup tool like duplicati.

Only gotcha I can see is that it's 5 x 1 TB.

[+] nolok|6 years ago|reply
There was a post from Backblaze a year (?) or so back where they commented on the Toshiba low failure rate with something along the lines of "they seem really reliable, but we buy in bulk and just don't have enough offers at low price for those, otherwise we would buy a lot of them".

Well, I run a couple dozen Synology NAS in professional setups, as well as two in personal setups (mine and my parents'), and ever since that post I have run the experiment of having almost 50% of all drives be Toshibas, and I have to say they do seem to be much more reliable (on the scale of "why does every other drive from Seagate and WD keep dying first, and often their replacement dies first too").

It is still a scale of use where it's mostly anecdotal rather than verifiable data, so don't take this fun comment for more than that. But I suspect a lot of people reading these posts are not interested in some large scale setup or anything like that but rather want to know which drives to put in their home computer or NAS, and honestly I can highly recommend the Toshibas for that. They do tend to be a bit more expensive (around 10% more? I buy them from ldlc.com and grosbill.com, French IT stores, no bulk buying or anything like that).

Of course, no matter the brand, never expect zero failures; a Toshiba drive may just as well die in the first ten minutes, so always plan for it.

[+] atYevP|6 years ago|reply
Yev from Backblaze here -> Yea, the Toshiba drives are great! If the price was lower, they'd likely play a larger role in our hard drive mix!
[+] DanCarvajal|6 years ago|reply
This is my favorite kind of content marketing.
[+] piepoter|6 years ago|reply
Shout out to Seagate, every drive that my friends and I have bought from them has eventually failed, good to see that they fail in non-consumer use too, not just for me! Stick to the Western Digitals.
[+] atYevP|6 years ago|reply
Yev here from Backblaze -> The Western Digital drives that you've purchased will fail too. That's part of the whole point of these reports, all of the drives eventually fail out or reach a state where we have to replace them - it's not any one specific manufacturer. That's part of why having a backup is so important, even the SSDs in newer machines will eventually go wonky.
[+] clhodapp|6 years ago|reply
As a teenager, I helped a friend revive a Seagate drive after it bricked due to faulty firmware. If I recall correctly, my friend had actually installed some firmware updates for the drive, but had not installed one recently enough to avoid the problem. We had to run wires to contacts on the board to allow us to run commands in a terminal (in Windows) on the only machine we could find with a serial port. When it worked, we felt like hackers from a movie!
[+] wil421|6 years ago|reply
Agreed. My Seagate drives all died, even ones bundled in external USB drives.

I have 10 year old WD Black and Green drives still kicking. I also have a gen 1 or 2 Intel SSD that’s still kicking.

[+] AnIdiotOnTheNet|6 years ago|reply
Dissenting opinion: I've owned and used a lot of Seagate drives. I have had a total of one fail on me. Given that they are usually less expensive than competing drives, I think they're fine. At the rate of disk size growth I end up replacing the disks for size reasons before they fail anyway.
[+] mywacaday|6 years ago|reply
I used to work for a long defunct storage manufacturer. It reached the point where we skipped either even or odd (I can't remember which) firmware versions on the Seagate drives. RMA counts would always be higher on the even/odd number.
[+] iforgotpassword|6 years ago|reply
Funny how everyone chimes in with their Seagate horror stories. My experience with them isn't much better.

But another fun story: I had a PSU blow up a couple years ago in a machine with three WDs and three HGST. All WDs were dead after that, the others worked flawlessly. Probably not a large enough sample size for any definite conclusions but at least it put a failure mode on my radar that wasn't there before.

[+] Hamuko|6 years ago|reply
I've only ever had one drive failure. It was a 3 TB Seagate. I remember it failing in like three years, so after the warranty had already expired.

My oldest drive is a 500 GB Western Digital from 2008 that's still operational today. I imagine its end is near, but I've thought about that for a couple of years now.

[+] noja|6 years ago|reply
Every failed drive I have owned was a Seagate too.
[+] kevin_thibedeau|6 years ago|reply
I'll never buy Seagate again after losing a drive because of a firmware bug that prevents it from coming online.
[+] briffle|6 years ago|reply
Interesting how the checksumming process sounds very much like ZFS's scrubbing process. One of the reasons I trust ZFS with my large data volumes is because it proactively looks for problems and fixes them (and most filesystems really can't look/check).
[+] gwern|6 years ago|reply
> By increasing the shard integrity check rate, we potentially moved failures that were going to be found in the future into Q3. While discovering potential problems earlier is a good thing, it is possible that the hard drive failures recorded in Q3 could then be artificially high as future failures were dragged forward into the quarter. Given that our Annualized Failure Rate calculation is based on Drive Days and Drive Failures, potentially moving up some number of failures into Q3 could cause an artificial spike in the Q3 Annualized Failure Rates. This is what we will be monitoring over the coming quarters.

Wouldn't survival analysis on interval-censored data handle this problem automatically? All of your observations of failure presumably are actually interval data, where all you know is that the drive failed sometime in between the last good check and the first bad check. Then it doesn't matter if some time periods have large intervals and others have small intervals, that just affects the precision of estimates.
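For reference, a minimal sketch of the Annualized Failure Rate calculation the quoted passage describes, assuming the usual form failures / (drive days / 365) expressed as a percentage. The drive-day and failure counts below are made up purely to illustrate how pulling some failures forward into one quarter inflates that quarter's figure.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """AFR in percent: failures per drive-year of operation, times 100."""
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical quarter: one million drive days observed.
DRIVE_DAYS = 1_000_000

baseline = annualized_failure_rate(50, DRIVE_DAYS)
# Same quarter, but 10 failures that would have surfaced later
# are detected early by the more frequent integrity checks.
pulled_forward = annualized_failure_rate(60, DRIVE_DAYS)

print(f"baseline AFR:       {baseline:.3f}%")
print(f"pulled-forward AFR: {pulled_forward:.3f}%")
```

Because the denominator only counts drive days within the quarter, the extra early detections raise the rate without any change in the drives' underlying reliability, which is the effect an interval-censored model would absorb.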

[+] h1d|6 years ago|reply
The only reason I use a storage provider other than Backblaze is the benchmark done by the author of one of the more modern backup tools.

https://github.com/gilbertchen/cloud-storage-comparison/blob...

Can anyone from Backblaze say anything about their performance compared to other vendors?

The pricing is certainly ahead of others, so I would use it if the performance is comparable to the leading group tested there.

[+] atYevP|6 years ago|reply
Yev here -> well that chart hasn't been updated in a while. For starters we're just $0.01/GB for downloads (we dropped the price last year). Our performance is generally pretty good, and we're partnered with Cloudflare (free egress) if you need more oomph. But most of the time folks don't have any issues with just our regular service.
[+] riobard|6 years ago|reply
I must have missed it somehow, but what is the difference between boot drives and data drives in a typical Backblaze server, other than that the boot drives store the OS? Obviously you don’t need 8TB capacity solely for the OS, so I’d assume you also store user data on boot drives? In which case why is there a distinction?
[+] atYevP|6 years ago|reply
Yev here from Backblaze -> Mainly it's the OS and log files - the reason we make a distinction is that the boot drives typically do not have as much load as the data drives, so it wouldn't be a real 1:1 comparison.
[+] ksec|6 years ago|reply
I wonder if Backblaze will eventually offer some sort of consumer backup solution: an app on iOS, Android, Mac and Windows that simplifies backup to your Backblaze NAS, which backs up to B2 as well.
[+] roddux|6 years ago|reply
I've been following these since maybe 2016? I don't remember when they started. It's striking to note that for as long as I recall, HGST still holds the crown of lowest annualised failure rate across the board.
[+] kortilla|6 years ago|reply
Off-topic but sort of related: is there a hard drive tower designed specifically for a pool of SSDs? The one I have with eight 3.5" bays could easily fit twice as many of the little SSDs...
[+] myself248|6 years ago|reply
The term "mobile rack" will find some of the densest ways to shove, for instance, eight 2.5" drives into a single 5.25" bay. That can get you some serious density in one of those "cdrom duplicator" tower cases that's all drive-bays, but heaven help you on the controller side.
[+] mikece|6 years ago|reply
And is there any reason to not trust WD Red drives in my home NAS?
[+] brandmeyer|6 years ago|reply
The plural of anecdote is not data. The point of Backblaze's publication is to share statistical data with the public at large. Their dataset doesn't include any WD Red drives; therefore, the data makes no statement about WD Red drives' reliability in either direction.

Consider this: the least reliable drive in this dataset has a 2.7% annualized failure rate at an average age of almost 4 years.

That's low enough that many SOHO users will never see a failure, yet high enough that a rack worth will have seen more than one failure recently. Thus, your question could be answered with complete honesty in both ways: with anecdotes by happy users and with anecdotes by unhappy users.

Therefore, neither positive nor negative answers are useful to you.

This phenomenon underlies the fundamental weakness of self-reported online reviews. You cannot actually get a useful measurement for how reliable a product is solely by self-reported sparse feedback.
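A rough sketch of why both kinds of anecdote are plentiful at a 2.7% annualized failure rate. It assumes independent failures at a constant yearly rate, which is a simplification, and the drive counts (4 for a home NAS, 60 for a larger enclosure) are hypothetical figures chosen for illustration.

```python
def prob_at_least_one_failure(afr: float, drives: int, years: float = 1.0) -> float:
    """Chance that at least one of `drives` fails within `years`.

    Assumes independent failures and a constant annual failure
    rate `afr` (as a fraction, e.g. 0.027 for 2.7%).
    """
    survive_all = (1.0 - afr) ** (drives * years)
    return 1.0 - survive_all


AFR = 0.027  # the 2.7% figure cited above

# Hypothetical fleet sizes, not taken from the report.
home_nas = prob_at_least_one_failure(AFR, drives=4)
large_enclosure = prob_at_least_one_failure(AFR, drives=60)

print(f"4 drives, 1 year:  {home_nas:.1%}")
print(f"60 drives, 1 year: {large_enclosure:.1%}")
```

Under these assumptions a small home setup will most likely see no failure in a given year, while a larger fleet most likely will, so individual reports in either direction say little about the drive model itself.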

[+] ahnick|6 years ago|reply
You shouldn't trust any drive. Trust in probabilities to protect you. You should be backing up your NAS to either a cloud provider, an external drive (if data will fit), or another NAS. Make sure to locate one copy offsite and practice restoring from your backup.
[+] zlynx|6 years ago|reply
I trust the Reds. Although, as someone else just said, it's an anecdote, not real data like the Backblaze report.

I started my NAS in 2012 with two 2 TB Red drives. Later I added two 6 TB Red drives. Some time, around 2016 I think, one of the 2 TB Reds failed and I replaced it with another 6 TB. Then the other 2 TB Red failed, like a year later and I put in another 6 TB so they all matched. I did not get replacements even though one failed during its warranty period (I am pretty sure the 2 TB Reds still had a 5-yr warranty at that time), because I wanted to replace it with a 6 TB anyway.

Currently all four 6 TB drives are still running, plus a couple of 4 TB Toshibas I grabbed at some point.

So, I don't think that drive failure after 4 or 5 years is so bad. I got my money's worth anyway, and that's why a NAS has redundant drives.

[+] kabdib|6 years ago|reply
All drives will fail. You're just buying time (or rather, the likelihood of getting more time out of the drive).

I've had a couple of WD Reds fail. You pop in another and rebuild the volume. It's normal as long as it's not too often.

[+] bscphil|6 years ago|reply
Practically speaking there's no way to avoid them (or their white-label equivalents) if you're on a typical home budget and you buy any significant number of drives. They're just too cheap (when you get the Easystores) compared to any other way of acquiring a large amount of disk space.

I'd rather buy WD than Seagate, but that's just me. I don't really have a choice. Maybe the only people who have a choice are either in the enterprise or buying one drive every few years for a PC build or something.

[+] probo23|6 years ago|reply
Is Blackblaze used mainly as a backup solution? Can anyone give me a use case?
[+] atYevP|6 years ago|reply
Yev here from Backblaze -> the company started by providing unlimited online backup, and that's a great industry for us. About 4 years ago we released Backblaze B2 Cloud Storage, which allows developers or sysadmins or enthusiasts to directly upload/retrieve data to/from our data centers. Our core competency is data storage - so while most folks do use us for a backup (either of their Mac or PC on the consumer side or servers/NAS devices with B2 Cloud storage) - what we really do is store and retrieve data.
[+] jseutter|6 years ago|reply
Assuming you mean Backblaze and not Blackblaze..

I have data that is important to me (family photos, etc.) on my hard drive. I have backups of that data running on a Raspberry Pi with an attached drive. If my house gets broken into, or burns down, or gets hit by a tornado, basically something Really Bad, Backblaze has a copy of my data offsite.

It is tempting to use Backblaze as my only backup, but like they describe on their site, their primary value is as a backup of your backups. Normally you should never have to use them, and if you do, it will be slow. Now if you are in a hurry they offer a service to ship you your data on a thumb drive or hard drive, but that gives you an idea of their primary use.