Not a file system developer but an ardent ZFS user. Here are my thoughts.
I pick ZFS every time because of its usability, its reliability, and the ability to easily understand what goes on without being a master of the OS ecosystem it runs on.
For example, pick ZFS, sit with the ZFS handbook for 15 minutes, and you already know how to create, modify, or destroy vdevs, zpools, and datasets.
When a disk gives out, it is easy to know what to do to let ZFS deal with it, without googling hard and relying on forums to learn what the right thing to do for your OS is - unless you already know it.
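As a sketch of how compact that surface is, here is the sort of session the handbook walks you through, using a throwaway pool built on sparse files so it is safe to try without real disks. The pool name `demo`, the file paths, and the `lz4` setting are hypothetical choices for illustration; this assumes ZFS is installed and you are root.

```shell
# Skip gracefully if ZFS is not available or we lack privileges.
command -v zpool >/dev/null 2>&1 || { echo "ZFS not installed, skipping"; exit 0; }
[ "$(id -u)" -eq 0 ] || { echo "needs root, skipping"; exit 0; }

truncate -s 256M /tmp/zd1 /tmp/zd2 /tmp/zd3   # file-backed "disks"

zpool create demo mirror /tmp/zd1 /tmp/zd2    # create a pool with a mirror vdev
zfs create demo/data                          # create a dataset
zfs set compression=lz4 demo/data             # modify a dataset property
zpool status demo                             # pool health at a glance

# A "disk" gives out: take it offline and resilver onto a replacement.
zpool offline demo /tmp/zd1
zpool replace demo /tmp/zd1 /tmp/zd3
zpool status demo

# Destroy everything.
zpool destroy demo
rm -f /tmp/zd1 /tmp/zd2 /tmp/zd3
```

The same handful of verbs (`create`, `set`, `status`, `replace`, `destroy`) cover most day-to-day administration, which is the point being made above.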
ZFS is not the fastest filesystem to pick, since it is a copy-on-write (COW) filesystem. But it does everything so well, and feels so solidly thought out, that it gives the user a lot of confidence, which is a very important quality for a file system.
I hope bcachefs and the many other modern filesystems that are vying for the kind of mainstream adoption ZFS has keep these virtues from ZFS. Until then, I will happily try and play with them, but not think about using them for the things I really care about.
The lack of a zero-copy sendfile for ZFS is one of several reasons that we (Netflix Open Connect) use UFS on our CDN nodes.
It would be interesting to see how he solved this. From a brief look at the FreeBSD code several years ago, I seem to recall that the ARC dealt in 8K chunks (maybe because that is the SPARC page size?), but x86_64 uses a 4K page size, and mmap() and sendfile() expect to deal in pages.
The other issue would be "taming" the ARC. We make heavy use of FreeBSD's SF_NOCACHE flag to sendfile(). This allows us to teach the VM system to keep hot content in the page cache, and drop less popular content.
Apple's aborted port had this, but it was the source of a lot of bugs and instability. It probably could have been made stable with enough time and effort, though.
It's incredible to me how many people are suddenly comfortable violating the license of ZFS simply because Canonical says it's okay. This is an open source project, under an OSI-approved open source license (CDDL) that is well known not to be GPL compatible. The license is derived from MPL 1.1, which is also known not to be GPL compatible. Why did they bother creating MPL 2.0, if we could have just ignored the GPL incompatibility in MPL 1.1?
You can't be someone who loses their mind over GPL violations but then willfully misinterprets someone else's open source license because they have something you want. The developers involved specifically picked the CDDL because it wasn't GPL compatible. That was their desire and intent, and so that's the license they went with. By ignoring the license, and the clearly known intent of the developers in choosing that license, simply because they made something you really want, I believe that ZFS on Linux is really a form of theft in the open source world. It's the same sort of attitude that leads to GPL violations.
I don't agree with making things CDDL to be explicitly GPL incompatible, but I do believe in respecting the author of the software in question and the license he or she chose. If people choose licenses I don't agree with, I don't use their software. I don't just use it anyway and pretend like there's no license issue.
You don't have to be RMS to see that these people who are ignoring the licensing issues (not just shipping user mode tools like a FUSE implementation which is legit) honestly don't care about open source and licensing and copyleft. They just see something someone else made that they want and they take it and ignore the licensing and developer intent. How can we hope to enforce the GPL successfully but then ignore other OSI approved licenses when it suits us?
This is why big names in Linux kernel development aren't remotely interested in ZFS.
Not to mention the fact that the users of ZFS on Linux are all setting themselves up for Oracle to bring the licensing inquisition.
It's an open legal question whether a user-loadable module violates the CDDL/GPL. Until it's fully litigated, it's going to remain an open question.
As an example: the FSF's position is that yes, if you dynamically link against a GPLd binary, all the code has to be GPLd, and if one wants to link against a copyleft library from a non-copyleft program, the library has to use the LGPL. However, the LGPL predates dynamic linking. It's from the days of static linking, where the (L)GPLd code would be distributed together in a single binary. In the world of dynamic linking that isn't true anymore.
A primary example is libreadline. It is (or was?) GPLd, and the FSF used it as a hammer to try to get other code to be GPLd. However, this (as far as I know) was never actually litigated.
As another example: the Linux kernel is under GPLv2, and Linus makes it clear that closed source programs can use the syscall interface without a problem, but it's not clear that they couldn't have if he hadn't said this. He includes the note to clarify his position, for those who would be concerned, that they can. I.e., the argument would be that without the syscall note, one couldn't ship any hardware device (say, an appliance you stick in your rack) with Linux and proprietary software. My gut is that most would find that difficult to swallow.
As an aside: I personally feel this has similarity to the Google v. Oracle Java lawsuit. If one thinks that APIs cannot be copyrighted, then arguably anything that just uses APIs and doesn't embed the actual GPLd code should be fine, as the APIs aren't able to be copyrighted. If, on the other hand, one views APIs as copyrightable as well, then it would be much more clear that the FSF's position is correct. Though the usage might still be fair use (and to an extent one might argue that it's even more clearly fair use than Google's usage).
If this ever does get litigated, I'll be following closely, as I would be very interested (intellectually at least) in the result.
TL;DR: until it's actually litigated, everyone who claims to speak an absolute truth is speaking out of their ass. Lawyers are a conservative bunch in general, so they'll give you the most conservative answer in terms of what you can do; that doesn't mean it's the absolute truth in terms of what you can and cannot do.
ZFS is a great filesystem, and possibly getting a better implementation than we have at the moment, or at least spurring additional development and optimization, is very welcome!
vbezhenar|6 years ago
reacharavindh|6 years ago
noja|6 years ago
The name doesn't help. It doesn't sound like a real filesystem; it sounds like something that accelerates a real filesystem.
oofabz|6 years ago
drewg123|6 years ago
lemoncucumber|6 years ago
m0zg|6 years ago
Yeah, bud, good luck with that. Someone hasn't worked with Oracle.
unknown|6 years ago
[deleted]
suprasam|6 years ago
BlackLotus89|6 years ago
Skunkleton|6 years ago
mappu|6 years ago
gigatexal|6 years ago
rincebrain|6 years ago
jgowdy|6 years ago
compsciphd|6 years ago
patrickg_zill|6 years ago