brianwski | 3 years ago
> Is there a failure list for write heavy drives?
To be clear about what these drive failure stats are and what they are not: Backblaze runs a data storage service with about 214,000 hard drives in it right now. We don't run any specific tests or induce issues on purpose FOR the drive failure stats; we just report what occurred in our datacenter.
Sometimes readers think we're carefully running a "study", but it's really just what we have experienced, offered up as honestly as we can. If the reads and writes and seeks our drives experience match your particular application, great! Or maybe it is just interesting to read.
Now, we do save (and publish) all the raw data, and some other awesome people out there have done various analyses on it, which always makes us happy too. You can find the raw data here: https://www.backblaze.com/b2/hard-drive-test-data.html At this point it goes back almost a full decade.
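For anyone who wants to poke at the raw data, here is a rough sketch of turning it into a per-model annualized failure rate. It assumes the daily CSV layout (one row per drive per day, with `model` and `failure` columns, where `failure` is 1 on the day a drive fails). The function name, the inline sample rows, and the model names are made up for illustration; check the column names against the files you actually download.

```python
import csv
import io
from collections import defaultdict


def annualized_failure_rates(rows):
    """Return {model: AFR in percent} from daily drive records.

    Assumes each row is one drive on one day, so counting rows
    counts drive-days. AFR = failures / (drive_days / 365) * 100.
    """
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for row in rows:
        model = row["model"]
        drive_days[model] += 1
        failures[model] += int(row["failure"])
    return {
        model: 100.0 * failures[model] / (drive_days[model] / 365.0)
        for model in drive_days
    }


# Tiny made-up sample standing in for a real daily CSV file.
SAMPLE = """date,serial_number,model,failure
2023-01-01,A1,MODEL_X,0
2023-01-01,A2,MODEL_X,0
2023-01-02,A1,MODEL_X,0
2023-01-02,A2,MODEL_X,1
2023-01-01,B1,MODEL_Y,0
2023-01-02,B1,MODEL_Y,0
"""

rates = annualized_failure_rates(csv.DictReader(io.StringIO(SAMPLE)))
for model, afr in sorted(rates.items()):
    print(f"{model}: {afr:.1f}% AFR")
```

With so few drive-days the numbers are meaningless, of course; the rate only settles down once a model has accumulated a lot of drive-days. To run it on real files, pass `csv.DictReader(open(path, newline=""))` in place of the sample.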
stonecharioteer|3 years ago
I'm building https://github.com/stonecharioteer/renfield for this purpose. Before I get around to it, I'm trying out git-annex, but I must say I don't like the git approach to files.
isomorphic|3 years ago
Someone with some time could correlate that to failure rate. My hypothesis is of course it's correlated--but by how much?
brianwski|3 years ago
We looked into it a little, some notes written up here: https://www.backblaze.com/blog/what-smart-stats-indicate-har...
Short summary is there are a few SMART stats that seem to predict failure way more than others, which is probably obvious. But we aren't PhDs in statistics and it isn't our area of focus, so....
This guy wrote a paper based on the Backblaze SMART data: https://etd.ohiolink.edu/apexprod/rws_etd/send_file/send?acc...
These 5 guys wrote another paper based on the Backblaze SMART data to train up a Bayesian network to predict failures: https://ieeexplore.ieee.org/document/8489097
This is another article on predicting hard drive failures using the Backblaze SMART data: https://karthikna.github.io/Prediction-of-Hard-Drive-Failure...
I can't comment on their findings, but it's DEFINITELY an interesting thing to study now that we have almost 10 years of these drive stats across a pretty big drive farm.
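If you just want a first look at whether one SMART stat lines up with failures before reaching for anything fancier, something like the sketch below is a starting point. It compares how often a raw SMART value is non-zero on failure-day rows versus all other rows. The column name `smart_5_raw` is only a placeholder choice on my part, the sample rows are invented, and this is a crude comparison, not the method used in any of the write-ups linked above.

```python
import csv
import io


def nonzero_share(rows, column):
    """Share of rows with a non-zero value in `column`, split by failure flag.

    Returns {0: share among non-failure rows, 1: share among failure rows}.
    Rows where the column is missing or blank are skipped. A share is None
    when there are no usable rows for that flag.
    """
    # failure flag -> [rows with non-zero value, total usable rows]
    counts = {0: [0, 0], 1: [0, 0]}
    for row in rows:
        raw = row.get(column)
        if raw is None or raw.strip() == "":
            continue
        flag = 1 if int(row["failure"]) else 0
        counts[flag][1] += 1
        if float(raw) > 0:
            counts[flag][0] += 1
    return {
        flag: (nonzero / total if total else None)
        for flag, (nonzero, total) in counts.items()
    }


# Invented sample rows; real files have many more columns.
SAMPLE = """date,serial_number,model,failure,smart_5_raw
2023-01-01,A1,MODEL_X,0,0
2023-01-01,A2,MODEL_X,0,0
2023-01-01,A3,MODEL_X,0,8
2023-01-01,A4,MODEL_X,1,120
2023-01-01,A5,MODEL_X,1,0
2023-01-01,A6,MODEL_X,0,
"""

shares = nonzero_share(csv.DictReader(io.StringIO(SAMPLE)), "smart_5_raw")
print("non-zero share on failure rows:    ", shares[1])
print("non-zero share on non-failure rows:", shares[0])
```

A big gap between the two shares suggests the stat is worth a closer look, but it says nothing about how far in advance the value changes, which is what you would need for actual prediction.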