Show HN: ZFS Implementation in Python

411 points | alcari | 6 years ago | github.com

86 comments

lunixbochs|6 years ago

Note: this was implemented without referencing any ZFS source code and should not be subject to the CDDL.

josteink|6 years ago

So we port this python to something not slow, and all the kernel-people can shut up about ZFS being terrible ;)

newnewpdro|6 years ago

Would the CDDL matter for a python implementation that will never become part of the kernel?

lelf|6 years ago

Bad title. It’s not a ZFS implementation, so hold your horses.

ivanbakel|6 years ago

Pierre Menard, author of The Filesystem.

But I'm surprised this is possible without a specification - how can you test a filesystem through hexdumps? The effects of some operations are going to be pretty far-reaching, surely?

randrus|6 years ago

Might be interesting/useful to aim for zfs send/receive compatibility :)

And thanks for the Borges callout.

cryptonector|6 years ago

There are lots of blog posts, lots of docs. There's ZFS code in GRUB that is GPL, etc.

aerovistae|6 years ago

One of my favorite short stories, and nobody else has ever read it. So glad to see someone else reference it.

mfsch|6 years ago

Does someone know whether it would be legal for someone to go through the ZFS code and write a specification of the features this author hasn’t figured out yet? I.e. could someone write a detailed description of the missing functionality that doesn’t include any details about the implementation so other people can implement it in non-CDDL code?

ummonk|6 years ago

Edit: yeah that is how you avoid copyright infringement https://en.wikipedia.org/wiki/Clean_room_design

Original comment: I could swear this was actually the standard practice for writing an implementation of an unknown file format or interface without infringing on copyright. But I don't remember the term for it.

atomicwrites|6 years ago

That's called a clean room implementation and was the standard way to make x-compatible products (for example, the BIOS on an IBM PC clone). Not sure what the current legal standing of that method is.

EDIT: Ninjad because I left the reply in a tab without posting.

Someone|6 years ago

The on-disk format is available (http://www.giis.co.in/Zfs_ondiskformat.pdf)

That PDF says "Unless otherwise licensed, use of this software is authorized pursuant to the terms of the license found at: http://developers.sun.com/berkeley_license.html". That link is broken, but it seems that’s the Berkeley license (whatever that means for a specification, and for which variant?)

According to http://open-zfs.org/wiki/Developer_resources, it's outdated, but still useful.

I think I would use that, rather than spend months diffing disk images.
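For anyone curious what "diffing disk images" might look like in practice, here is a minimal sketch. It is not taken from the project; the function name `changed_ranges` and the 512-byte block size are assumptions chosen for illustration. It compares two images block by block and reports the byte ranges that differ, which is the kind of output you would then inspect in a hexdump.

```python
from typing import List, Tuple


def changed_ranges(before: bytes, after: bytes,
                   block_size: int = 512) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges, aligned to block_size, where the
    two images differ. Adjacent differing blocks are merged into one range.

    Hypothetical helper for illustration; block_size=512 is an assumption.
    """
    length = max(len(before), len(after))
    ranges: List[Tuple[int, int]] = []
    start = None  # start offset of the current run of differing blocks
    for offset in range(0, length, block_size):
        same = (before[offset:offset + block_size]
                == after[offset:offset + block_size])
        if not same and start is None:
            start = offset                  # a differing run begins here
        elif same and start is not None:
            ranges.append((start, offset))  # the run ended at this block
            start = None
    if start is not None:
        ranges.append((start, length))      # run extends to end of image
    return ranges


if __name__ == "__main__":
    # Simulate "write something, then see what moved on disk".
    image_before = bytes(2048)
    image_after = bytearray(image_before)
    image_after[600] = 0xFF
    image_after[1030] = 0xFF
    print(changed_ranges(image_before, bytes(image_after)))
```

For real images you would read the two files (or `mmap` them) instead of building byte strings in memory; the comparison logic stays the same.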

AlexanderDhoore|6 years ago

What?! How?! Why?!

This is the greatest thing ever. I wish I could just write code for the fun of it. Every time I wonder whether people will use it and give up before I even get started.

js2|6 years ago

I wrote this to scratch my own itch but mostly just for the fun of it and some people ended up using it.

https://github.com/jaysoffian/eap_proxy

This was a really simple project but tickled all my fancies: Python, low-level, networking, reverse engineering, system administration.

Just do it! Who cares if people use it?

Alternatively, contribute something to some open source project you use. I’ve done that too. Just small stuff here and there but that’ll guarantee someone uses your code if that’s what’s important to you. It only takes 39 commits to get on this page:

https://github.com/git/git/graphs/contributors

:-)

max0563|6 years ago

I had this problem too. I’ve been able to get over it by coming up with a scenario, even if it’s completely fabricated, where what I am doing can be useful. I also make sure that I incorporate something new that I want to learn in the project. Whether it’s a language, library, whatever. Then I give myself a date I can quit. Normally it’s about two months. This makes me really consider whether I want to take something on, because if I do I force myself to dedicate two months of time to it. If I still enjoy it at the end of two months then I continue, otherwise I move on to another idea. At least in that time I become a little better at whatever I was trying to do. That’s the real goal anyway.

mirceal|6 years ago

why not? writing code can fall into one of a few buckets. one of them is play.

craftyguy|6 years ago

> What?! How?! Why?!

The readme literally answers this..

lelf|6 years ago

No ARC/L2ARC?

Edit: of course not. This is actually just a ZFS user-facing "front-end", not a ZFS implementation.

lunixbochs|6 years ago

It's capable of doing IO against a real ZFS array without any other code. ARC is an implementation detail and not necessary for correctness. If you removed ARC from ZoL it would still work, just slower. ARC is far from the most interesting milestone for a reimplementation effort because an ARC implementation doesn't need to be anything like the Sun version internally, as long as it offers similar performance.

This project is cool not because you're going to run the Python in your kernel today, but because someone can use it as a documented reference implementation of all of the data structures and transactions that is not covered by the CDDL, so another implementation based on this can live in the Linux kernel without problem.
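To illustrate the point about a cache being a performance layer rather than a correctness requirement, here is a minimal sketch. It is not the project's code and it is not ARC: it uses a plain LRU policy as a stand-in (ARC, the Adaptive Replacement Cache, balances recency and frequency). `CachedReader` and `fake_disk_read` are hypothetical names invented for this example.

```python
from collections import OrderedDict
from typing import Callable


class CachedReader:
    """Read-through block cache with LRU eviction (a stand-in, not ARC)."""

    def __init__(self, read_block: Callable[[int], bytes], capacity: int = 64):
        self._read_block = read_block   # the uncached, "correct" read path
        self._capacity = capacity
        self._cache: "OrderedDict[int, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_no: int) -> bytes:
        if block_no in self._cache:
            self._cache.move_to_end(block_no)   # mark as most recently used
            self.hits += 1
            return self._cache[block_no]
        self.misses += 1
        data = self._read_block(block_no)       # fall through to the backend
        self._cache[block_no] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)     # evict least recently used
        return data


def fake_disk_read(block_no: int) -> bytes:
    # Hypothetical backing store: each block's content encodes its number.
    return block_no.to_bytes(8, "little") * 64


if __name__ == "__main__":
    reader = CachedReader(fake_disk_read, capacity=2)
    for n in [0, 1, 0, 2, 1, 3, 0]:
        # Cached and uncached reads always return the same bytes.
        assert reader.read(n) == fake_disk_read(n)
    print("hits:", reader.hits, "misses:", reader.misses)
```

Swapping the eviction policy, or removing the cache entirely, changes only the hit and miss counts, never the bytes returned.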

4oo4|6 years ago

This is awesome! Do you plan on blogging anything about how you went about reverse-engineering?

PaulHoule|6 years ago

i love userspace implementations of filesystems.

note that the issues are entirely different from those with a kernel implementation since you don't have to think about the page cache et al.

gaze|6 years ago

Hell yeah, dude. This is awesome.

foxhop|6 years ago

Hey alcari!

I know you from ICV - we used to hang out online on the forums and IRC.

Nice work on this project, I'm looking forward to diving into the codebase!

rashkov|6 years ago

Just curious, what is / was ICV?

RantyDave|6 years ago

This is all kinds of funny. I'm awash with awe and admiration.

garmaine|6 years ago

Simultaneously fucking awesome (that you pulled it off at all), and fucking useless (performance...).

Thanks for sharing though. Maybe could be useful in making a suite of zfs inspection tools?

Is the OP here? How difficult would it be to create a zfs reshaping tool, allowing for the offline expansion of a vdev?