Nice write-up and website. I should snapshot my empty root!
If I’m not wrong, at least some of those sharp edges have been resolved. There was a famous, very hard to reproduce bug that caused problems with ZFS send/receive of encrypted snapshots once in a blue moon; it was hunted down and fixed recently.
Still, ZFS needs better tooling. The user has two keys and an encrypted dataset, doesn’t care which dataset is the encryption root, and should simply be able to decrypt. ZFS should send all the information required to decrypt.
The code for ZFS encryption hasn’t been updated since the original developer left, last I checked.
In my view, in this case, you could say ZFS nearly lost data: it ties dataset settings to the pool and doesn’t send the settings needed to reproduce them when a dataset is replicated. The user is clearly knowledgeable about ZFS and still almost lost data.
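For anyone hitting this, the encryption-root relationship can at least be inspected and re-pointed by hand. A sketch (`tank/backup/data` is a made-up dataset name; keys must be loaded before changing them):

```shell
# Show which dataset actually holds the wrapping key for this one:
zfs get encryptionroot tank/backup/data

# Make the dataset its own encryption root with a fresh passphrase,
# so it no longer depends on a parent's key:
zfs change-key -o keyformat=passphrase tank/backup/data

# Or the reverse: make it inherit its parent's key again:
zfs change-key -i tank/backup/data
```

The frustration in the thread is that none of this is surfaced automatically: `zfs send` won’t warn you when the receiving side ends up with an encryption root it cannot satisfy.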
Why a ZFS freak-out is accepted as "normal" in a dev environment is beyond me. I have used Storage Spaces daily in production and dev environments for nearly 10 years now, and with only marginal use of PowerShell I have been able to restore every array I didn't destroy intentionally. This is the bare minimum I expect of a redundant array of any type, regardless of its speed or scalability promises.
This is a case of a user changing a password setting and realizing he couldn't use it with old backups after accidentally destroying one dataset. ZFS is intended for servers and sysadmins, so it is not as friendly as some may expect, but it did not lose anything the user did not destroy. The author had to use logic to deduce what he did and walk it back.
OpenZFS has worked fine for me, in mirror mode, for 15 years without anything resembling data loss.
When I had to replace HDDs, the operations were very smooth. I don't mess with ZFS all that often; I rely on the documentation. I must say that IMO the CLI is a breath of fresh air compared to the other options we had in the past (ext3/4, ReiserFS, XFS, etc.). Now, Btrfs might be easier to work with; I can't tell.
btw, this bug is well known amongst openZFS users. There are quite a few posts about it.
> Lesson: Test backups continuously so you get immediate feedback when they break.
This is a very old lesson that should have been learned by now :)
But yeah the rest of the points are interesting.
FWIW I rarely use ZFS native encryption. Practically always I use it on top of cryptsetup (which is a frontend for LUKS) on Linux, and GELI on FreeBSD. It's a practice from the time when ZFS didn't support encryption, and these days I just keep doing what I know.
I really love ZFS native encryption, but this is the big problem with it. I use ZFS raw sends to store my backups incrementally in a cloud I trust, but not enough to have raw access to my files. ZFS has great attributes there, theoretically: I can send delta updates of my filesystems, and the receiver never has the keys to decrypt them.
I've used this in practice for many years (since 2020), and aside from encountering exactly this issue (though thankfully I did have a bookmark already in place), it's worked great. I've tested restores from these snapshots fairly regularly (~ quarterly), and only once had an issue related to a migration - I moved the source from one disk to another. This can have some negative effects on encryption roots, which I was able to solve... But I really, really wish that ZFS tooling had better answers to it, such as being able to explicitly create and break these associations.
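For those who haven't seen it, the raw-send workflow described above looks roughly like this. A sketch with made-up pool and host names; `-w` is the raw flag, so blocks cross the wire still encrypted and the receiver never needs the key:

```shell
# Initial raw (still-encrypted) replication to an untrusted target:
zfs snapshot tank/data@snap1
zfs send -w tank/data@snap1 | ssh backup-host zfs receive backup/data

# Later, send only the delta between two snapshots:
zfs snapshot tank/data@snap2
zfs send -w -i tank/data@snap1 tank/data@snap2 | ssh backup-host zfs receive backup/data
```

The encryption-root pitfall in the article bites exactly here: the received datasets carry key metadata that must still line up with a key you control at restore time.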
ZFS encryption is much more space efficient than dm-crypt + unencrypted ZFS when combined with zstd compression. This is because it can do compress-then-encrypt instead of encrypt-then-(not-really-)compress. It is also much, much faster.
Source: I work for a backup company that uses ZFS a lot.
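For reference, a dataset combining native encryption with zstd can be created in one step. A sketch; `tank/secure` is a made-up name:

```shell
# Compression and encryption as native dataset properties.
# ZFS compresses each block first, then encrypts the compressed block.
zfs create -o encryption=aes-256-gcm \
           -o keyformat=passphrase \
           -o compression=zstd \
           tank/secure
```

Encrypted data is indistinguishable from random noise, which is why any layer that tries to compress *after* encryption achieves essentially nothing.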
I use native ZFS encryption because it makes it super easy to share encrypted datasets across dual-booted operating systems. AFAIK Linux does not support GELI and FreeBSD does not support LUKS. DragonflyBSD supports LUKS but then no ZFS.
Also, that way I can have Linux and FreeBSD living on the same pool, seamlessly sharing my free space, without losing the ability to use encryption. Doing both LUKS and GELI would require partitioning and giving each OS its own pool.
It's hard to write a completely automated backup test that's also pretty thorough. Yeah, it would have caught "completely unmountable", but there are a lot of other problems that a basic script has little hope of catching.
I do manual backup checks, and so did the author, but those are going to be limited in number.
This all seems unbelievably more complicated and prone to failure than just doing luks over mdadm. You could just skip this weird, arcane process by imaging the disks, walking them to where they needed to be, then slapping them into the other machine and mounting them as normal.
I do not understand making RAID and encryption so very hard, and then using some NAS-in-a-box distribution, like an admission that you don't have the skills to handle it. A lot of people are using ZFS and "native encryption" on Arch Linux (not in this case) when they should just be using mdadm and LUKS on Debian stable. It's like they're overcomplicating things in order to be able to drop trendy brand names around other nerds, then often dramatically denouncing those brand names when everything goes wrong for them.
If you don't have any special needs, and you don't know what you're doing, just do it the simple way. This all just seems horrific. I've got >15 year old mdadm+luks arrays that have none of their original disks, are 5x their original disk size, have survived plenty of failures, and aren't in their original machines. It's not hard, and dealing with them is not constantly evolving.
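For comparison, the "simple way" being advocated here is roughly the following. A sketch with placeholder device names; double-check devices before running anything like this, since these commands are destructive:

```shell
# RAID-1 mirror out of two disks:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

# LUKS on top of the array, then a plain filesystem:
cryptsetup luksFormat /dev/md0
cryptsetup open /dev/md0 cryptroot
mkfs.ext4 /dev/mapper/cryptroot
mount /dev/mapper/cryptroot /mnt
```

Each layer here is independent, which is the commenter's point: the encryption knows nothing about the RAID, and neither knows anything about the filesystem.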
Reading this gives me childhood anxiety from when I compressed my dad's PC with a BBS-pirated copy of Stacker so I would have more space for pirated Sierra games. It errored out before finishing, and everything was inaccessible. I spent from dusk to dawn trying to figure out how to fix it (before the internet, but I was pretty good at DOS) and I still don't know how I managed it. I thought I was doomed. Ran like a dream afterwards and he never found out.
There are very real reasons to use ZFS instead of the oldschool Linux block device sandwich.
mdadm+luks+lvm still do not quite provide the same set of features that ZFS alone does even without encryption. Namely in-line compression, and data checksumming, not to mention free snapshots.
ZFS is quite mature, the feature discussed in the article is not. As others have pointed out this could have been avoided by running ZFS on top of luks and would have hardly sacrificed any functionality.
> I do not understand making RAID and encryption so very hard,
I don't use ZFS-native encryption, so I won't speak to that, but in what way is RAID hard? You just `zpool create` with the topology and devices and it works. In fact,
> If you don't have any special needs, and you don't know what you're doing, just do it the simple way. This all just seems horrific. I've got >15 year old mdadm+luks arrays that have none of their original disks, are 5x their original disk size, have survived plenty of failures, and aren't in their original machines. It's not hard, and dealing with them is not constantly evolving.
I would write almost this exact thing, but with ZFS. It's simple, it's easy, it just keeps going through disk replacements and migrations.
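To the point about RAID not being hard: the entire setup, a health check, and a disk replacement fit in three commands. A sketch with placeholder pool and device names:

```shell
# Create a mirrored pool -- RAID, filesystem, and mount in one step:
zpool create tank mirror /dev/sda /dev/sdb

# Check health, then swap a failing disk for a new one:
zpool status tank
zpool replace tank /dev/sda /dev/sdc
```

There is no separate mkfs or fstab step; the pool resilvers onto the replacement disk automatically.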
> I very nearly permanently lost 8.5 TiB of data after performing what should've been a series of simple, routine ZFS operations but resulted in an undecryptable dataset. Time has healed the wound enough that I am no longer filled with anguish just thinking about it, so I will now share my experience in the hope that you may learn from my mistakes.
As a zfs user employing encryption, that read like a horror story. Great read, and thanks for the takeaway.
I've used ZFS and Btrfs, and while I haven't quite lost data, I have hit some unnerving pitfalls / sharp edges that have convinced me to keep at least one copy using just LUKS + ext4. I like the features, but I think the more complicated filesystems bring other kinds of risks.
I am not sure if this is the correct place, but pardon me: I was once trying to remove a LUKS encryption key, and I searched it on Stack Overflow thinking that I was going to figure this out myself...
The first thing on stackoverflow permanently made the data recoverable and it was only under the comment that people mentioned this...
My whole collection of projects and whatnot got lost because of it, and that taught me the lesson of actually reading the whole thing.
I sometimes wonder if using AI would've made any difference, or whether it would even have mattered, because I didn't want to use AI and that's why I went to Stack Overflow lol... But at the same time, AI hallucinates too. It was a good reality check for me to always read the whole thing before running commands.
> I sometimes wonder if using AI would've made any difference or would it have even mattered because I didn't want to use AI and that's why I went to stackoverflow lol
AI is trained on Stack Overflow and much, much worse support forums. At least SO has the comments below bad advice to warn others; AI will just say "Oops, you're entirely right, I made a mistake and now your data is permanently gone".
Did you mean "unrecoverable"? I first read your comment as "ok, the solution is trivially easy so the article is unnecessary", but the rest of your comment implies the opposite.
There wasn't a destroyed pool, it's the harder version of trying to rewind time on the filesystem. It's worth trying once the disks are fully backed up, but it's fussy enough that I can understand why they made it plan B.
Thanks for validating my choice to not use raw send/recv. I know not everyone can avoid it, but it also seemed to be a bit prone to this kind of issue.
> btw, this bug is well known amongst openZFS users.

One that should not exist, of course, but certainly not a normal one.
`zpool import -D`

https://openzfs.github.io/openzfs-docs/man/master/8/zpool-im...

I haven't tried this, but I gather from the blog post that it would have been much simpler, as it didn't require any of the encryption stuff.
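Per the zpool-import man page, the recovery flow would be something like the following. A sketch; `tank` is a hypothetical pool name, and I haven't exercised this path myself either:

```shell
# List pools that were destroyed but whose disks are still intact:
zpool import -D

# Re-import one by name; -f may be needed if it still looks in use:
zpool import -D -f tank
```

This only works as long as the pool's labels haven't been overwritten, which is presumably why the author treated it as a plan B.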