This may be a little off topic, but I used to think RAID 5 and RAID 6 were the best RAID configs to use. It seemed to offer the best bang for buck. However, after seeing how long it took to rebuild an array after a drive failed (over 3 days), I'm much more hesitant to use those RAIDS. I much rather prefer RAID 1+0 even though the overall cost is nearly double that of RAID 5. It's much faster, and there is no rebuild process if the RAID controller is smart enough. You just swap failed drives, and the RAID controller automatically utilizes the back up drive and then mirrors onto the new drive. Just much faster and much less prone to multiple drive failures killing the entire RAID.
halfcat|11 years ago
The problem is that once a drive fails, during the rebuild, if any of the surviving drives experience an unrecoverable read error (URE), the entire array will fail. On consumer-grade SATA drives that have a URE rate of 1 in 10^14, that means if the data on the surviving drives totals 12TB, the probability of the array failing rebuild is close to 100%. Enterprise SAS drives are typically rated 1 URE in 10^15, so you improve your chances ten-fold. Still an avoidable risk.
RAID6 suffers from the same fundamental flaw as RAID5, but the probability of complete array failure is pushed back one level, making RAID6 with enterprise SAS drives possibly acceptable in some cases, for now (until hard drive capacities get larger).
I no longer use parity RAID. Always RAID10 [3]. If a customer insists on RAID5, I tell them they can hire someone else, and I am prepared to walk away.
I haven't even touched on the ridiculous cases where it takes RAID5 arrays weeks or months to rebuild, while an entire company limps inefficiently along. When productivity suffers company-wide, the decision makers wish they had paid the tiny price for a few extra disks to do RAID10.
In the article, he has 12x 4TB drives. Once two drives failed, assuming he is using enterprise drives (Dell calls them "near-line SAS", just an enterprise SATA), there is a 33% chance the entire array fails if he tries to rebuild. If the drives are plain SATA, there is almost no chance the array completes a rebuild.
[1] http://www.smbitjournal.com/2012/11/choosing-a-raid-level-by...
[2] http://www.smbitjournal.com/2012/05/when-no-redundancy-is-mo...
[3] http://www.smbitjournal.com/2012/11/one-big-raid-10-a-new-st...
Twirrim|11 years ago
In reality you'll find that drives perform significantly better than that, arguably orders of magnitude better.
That said, I'm still not a fan of RAID5. Both rebuild speed and probability of batch failures of drives (if one drive fails, the probability of another drive failing is significantly higher if they came from the same batch), make it too risky a prospect for my comfort.
feld|11 years ago
coned88|11 years ago
Forlien|11 years ago
t0mas88|11 years ago
The main reason is not only the rebuild time (which indeed is horrible for a loaded system), but also the performance characteristics that can be terrible if you do a lot of writing.
We for example more than doubled the throughput of Cassandra by dropping RAID5. So the saving of 2 disks per server in the end required buying almost twice as many servers.
ntsb|11 years ago
golan|11 years ago
Then, you need to come up with other creative solutions, like deep syncing inside the FS tree, etc. Fun times.
Add checksumming, and you probably would like to take a holiday whilst it copies the data :-)
grammar|11 years ago
greenknight|11 years ago
Disks were connected via SATA 3 and was using mdadm
dspillett|11 years ago
There are a few ways to improve RAID rebuild time (see http://www.cyberciti.biz/tips/linux-raid-increase-resync-reb... for example) but be aware that they tend to negatively affect performance of other use of the array while the build is happening (that is, you are essentially telling the kernel to give more priority to the rebuild process than it normally would when there are other processes accessing the data too).
click170|11 years ago
Edit: To clarify, I was using mdadm, not hardware raid.
LeoPanthera|11 years ago
craigyk|11 years ago
Resilvering one drive usually takes < 48 hours. Last time took ~32 hours.
Something else cool: When I was planning this out I wrote a little script to calculate the chance of failure. With a drive APF of 10% (which is pretty high), and a rebuild speed of 50MB/sec (very low compared to what I typically get) I have a 1% chance of losing the entire pool over a 46.6 year period. If I add 4 more raidz2 10X3TB VDEVS that would drop to 3.75 years.
XorNot|11 years ago
diplomatpuppy|11 years ago
fiatmoney|11 years ago
kyrra|11 years ago
beachstartup|11 years ago
three cost levers: capacity, speed, and downtime.
basically, at some point you will begin to realize the cheapest cost is the capacity itself, everything else is an order of magnitude more difficult to justify. downtime and slowness are simply unacceptable, whereas buying more stuff is perfectly acceptable and quite feasible solution in an enterprise environment.
hobs|11 years ago
I work with a number of medium sized businesses of which you can clearly tell who have had and who havent had major downtime due to these types of hardware failures.
Many people who have never experienced issues have no interest in even forming a basic DR plan, but the ones who have just start throwing money at you the moment you mention it.
Buying more stuff is WAY cheaper than slowness and downtime, and penny pinching on this count is going to cause so much additional heartache for what will in the end cost more anyway.