top | item 45603538

(no title)

akvadrako | 4 months ago

If you mostly care about data integrity, then a plain RAID-1 mirror over three disks is better than RAIDZ. Correlated drive failures are not uncommon, especially if they are from the same batch.

I would also recommend an offline backup, like a USB-connected drive you mostly leave disconnected. If your system is compromised, an attacker could encrypt everything, and could probably also reach any backup that stays connected and encrypt that too.


ssl-3 | 4 months ago

Better how?

With RAID 1 (across 3 disks), any two drives can fail without loss of data or availability. That's pretty cool.

With RAIDZ2 (whether across 3 disks or more than 3; it's flexible that way), any two drives can fail without loss of data or availability. At least superficially, that's ~equally cool.
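To put rough numbers on that comparison, here is a minimal sketch in Python. The `redundancy` helper is made up for illustration, and it only counts whole disks: it ignores padding, metadata overhead, and everything else that makes real-world capacity a bit messier.

```python
def redundancy(layout: str, disks: int) -> tuple[int, int]:
    """Return (drive failures tolerated, disks' worth of usable space) for one vdev.

    Simplified model: whole disks only, no allocation overhead.
    """
    if layout == "mirror":
        # Every disk holds a full copy, so all but one may fail.
        return disks - 1, 1
    parity = {"raidz1": 1, "raidz2": 2, "raidz3": 3}[layout]
    if disks <= parity:
        raise ValueError("a RAIDZ vdev needs more disks than parity devices")
    return parity, disks - parity


for layout, disks in [("mirror", 3), ("raidz1", 3), ("raidz2", 3), ("raidz2", 6)]:
    tolerated, usable = redundancy(layout, disks)
    print(f"{layout} x{disks}: survives {tolerated} failures, {usable} disk(s) usable")
```

Under this model a 3-disk mirror and a 3-disk RAIDZ2 come out the same (two failures tolerated, one disk of usable space), while a 3-disk RAIDZ1 trades one failure of tolerance for a second disk of space.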

That said: If something more like plain-Jane RAID 1 mirroring is desired, then ZFS can do that instead of RAIDZ (that's what the mirror vdev type in zpool create is for).

And it can do this while still providing efficient snapshots (accidentally deleted or otherwise ruined a file last week? no problem!), fast transparent data compression, efficient and correct incremental backups, and the whole rest of the gamut of stuff that ZFS just boringly (read: reliably, hands-off) does as built-in functions.

It's pretty good stuff.

All that good stuff works fine with single disks, too. Including redundancy: ZFS can use copies=2 to store multiple (in this case, 2) copies of everything, which can allow for reading good data from single disks that are currently exhibiting bitrot.
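The idea behind reading good data off a rotting single disk can be sketched in a few lines of Python. This is a toy model, not how ZFS lays things out on disk: the `write_block` and `read_block` names, the dict structure, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def write_block(data: bytes, copies: int = 2) -> dict:
    # The checksum is kept apart from the data copies it describes.
    return {
        "checksum": checksum(data),
        "copies": [bytearray(data) for _ in range(copies)],
    }


def read_block(block: dict) -> bytes:
    # Return the first copy whose checksum still matches; never return bad data.
    for copy in block["copies"]:
        if checksum(bytes(copy)) == block["checksum"]:
            return bytes(copy)
    raise IOError("all copies failed checksum verification")


block = write_block(b"important notes", copies=2)
block["copies"][0][0] ^= 0x01  # simulate bitrot in the first copy
print(read_block(block))       # the second copy is still good
```

With only one copy, the same read would fail loudly instead of handing back corrupted bytes, which is the detection half of the story.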

This property carries with the dataset -- not the pool. In this way, a person can have their extra-important data [their personal writings, or system configs from /etc, or whatever probably relatively-small data] stored with extra copies, and their less-important (probably larger) stuff stored with just one copy...all on one single disk, and without thinking about any lasting decisions like allocating partitions in advance (because ZFS simply doesn't operate using concepts like hard-defined partitions).
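For the curious, the per-dataset setup described above might look something like the following. The Python here only builds the command lines; it does not run anything. The pool name `tank`, the dataset names, and the `/dev/sdX` device path are placeholders, so substitute your own before trying any of it.

```python
import shlex


def create_pool(pool: str, device: str) -> list[str]:
    # Single-disk pool: no mirror or raidz keyword, just one device.
    return ["zpool", "create", pool, device]


def create_dataset(dataset: str, copies: int = 1) -> list[str]:
    # ZFS accepts copies=1, 2, or 3; the value is a property of the dataset.
    if copies not in (1, 2, 3):
        raise ValueError("copies must be 1, 2, or 3")
    return ["zfs", "create", "-o", f"copies={copies}", dataset]


commands = [
    create_pool("tank", "/dev/sdX"),            # placeholder names
    create_dataset("tank/writings", copies=2),  # small, extra-important data
    create_dataset("tank/media", copies=1),     # large, less-important data
]
for cmd in commands:
    print(shlex.join(cmd))
```

One thing to keep in mind: changing copies on an existing dataset only affects data written after the change, so setting it at creation time, as sketched here, avoids having older blocks with fewer copies.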

I agree that keeping an offline backup is also good because it provides options for some other kinds of disasters -- in particular, deliberate and malicious disasters. I'd like to add that this single normally-offline disk may as well be using ZFS, if for no other reason than bitrot detection.

It's great to have an offline backup even if it is just a manually-connected USB drive that sits on a shelf.

But we enter a new echelon of bad if that backup is trusted and presumed to be good even when it has suffered unreported bitrot:

Suppose a bad actor trashes a filesystem. A user stews about this for a bit (and maybe reconsiders some life choices, like not becoming an Amish leatherworker), and decides to restore from the single-disk backup that's sitting right there (it might be a few days old or whatever, but they decide it's OK).

And that's sounding pretty good, except: With most filesystems, we have no way to tell if that backup drive is suffering from bitrot. It contains only presumably good data. But that presumed-good data is soon to become the golden sample from which all future backups are made: When that backup has rotten data, then it silently poisons the present system and all future backups of that system.

If that offline disk instead uses ZFS, then the system detects and reports the rot condition automatically upon restoration -- just in the normal course of reading the disk, because that's how ZFS do. This allows the user to make informed decisions that are based on facts instead of blind trust.
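Reads are verified on their own, but a user who wants to check the whole backup disk before trusting it could also ask for a full scrub first. A rough sketch of that sequence, again only building the command lines in Python; the pool name `backup` is a placeholder and the helper name is made up.

```python
import shlex


def backup_check_commands(pool: str) -> list[list[str]]:
    return [
        ["zpool", "import", pool],        # attach the offline pool
        ["zpool", "scrub", pool],         # start reading and verifying every block
        ["zpool", "status", "-v", pool],  # shows scrub progress and any files with errors
        ["zpool", "export", pool],        # detach, only once the scrub has finished
    ]


for cmd in backup_check_commands("backup"):
    print(shlex.join(cmd))
```

The scrub runs in the background, so the status command is the one to re-run until it reports completion, and it is also where any damaged files would be listed by name.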

With ZFS, nothing is silently poisoned.

akvadrako | 4 months ago

Better than RAIDZ1, which is what you suggested, in the sense that it's more reliable. I didn't say anything about not using ZFS.