smoochy | 3 years ago

Can someone articulate the differences from ZFS, apart from the fact that Google's file system seems to work better with large files and allows simultaneous writing (or rather, appending) without blinking an eye? That would mostly be useful for, maybe, very large tech companies.


KaiserPro|3 years ago

ZFS is grand for one machine.

GFS/Colossus is designed to offer a similar level of features (well, in terms of performance and namespace, not snapshots) but over many machines.

ZFS doesn't have to worry too much about latency, or about another machine writing to the block that it wants to update. As soon as you break free from one machine, you have to think about how you deal with: where data lives (is the data stored separately from the metadata?); how a client signals who has a write lock; permissions (i.e. user authentication: NFS basically just uses user IDs, which can be faked unless you are running Kerberos/NFSv4); and data affinity (pulling data from another data center is expensive, so do we sync, cache or clone? How do we resolve write clashes?).

All of those questions have an impact on speed, scalability, reliability and durability. You need to choose your designs carefully to get a filesystem that does what you want it to.
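
To make two of those questions concrete (metadata stored apart from data, and what happens on a write clash), here is a toy in-memory sketch. Everything in it (MetadataServer, ChunkInfo, the host names, the version check) is made up for illustration and is not how GFS or Colossus actually do it; it just shows one possible answer, optimistic versioning.

```python
# Toy model only: every name here is hypothetical, not a GFS/Colossus API.
from dataclasses import dataclass, field


@dataclass
class ChunkInfo:
    servers: list[str]   # machines assumed to hold replicas of the data
    version: int = 0     # bumped on every accepted write


@dataclass
class MetadataServer:
    """Holds only the namespace and data locations, never file contents."""
    chunks: dict[str, ChunkInfo] = field(default_factory=dict)

    def create(self, path, servers):
        self.chunks[path] = ChunkInfo(servers=list(servers))

    def lookup(self, path):
        info = self.chunks[path]
        return info.servers, info.version

    def commit(self, path, expected_version):
        """Accept a write only if nobody else committed since the lookup."""
        info = self.chunks[path]
        if info.version != expected_version:
            return False     # write clash: caller must re-read and retry
        info.version += 1
        return True


meta = MetadataServer()
meta.create("/logs/a", ["dc1-host7", "dc2-host3"])

# Two clients read the same version, then both try to write.
_, v_client1 = meta.lookup("/logs/a")
_, v_client2 = meta.lookup("/logs/a")
first = meta.commit("/logs/a", v_client1)    # accepted
second = meta.commit("/logs/a", v_client2)   # stale version, rejected
print(first, second)
```

Rejecting the stale write is only one choice; a design could instead hand out a write lease up front, which trades retries for an extra round trip.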

dekhn|3 years ago

One of the most important differences is that Colossus and GFS are user-space filesystems for the client. You're just making RPCs to a service. There is no kernel filesystem. Upgrading the client and the server is just a software upgrade, with no kernel changes. And the operations were intentionally restricted: for example, instead of the debacle that is locking in NFS, locking is part of a different (and even more fundamental) tech at Google (Chubby, which is NOT a filesystem, or so they say).
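
A rough sketch of that shape, assuming nothing about the real APIs: the file client is an ordinary library whose operations stand in for RPCs, and locking lives in a separate service the file client merely calls. LockService and FileClient are hypothetical names, and the dict is a stand-in for a remote storage service.

```python
# Hypothetical sketch: LockService and FileClient are illustrative names only.
import threading


class LockService:
    """Stand-in for a separate coordination service (the role Chubby plays)."""

    def __init__(self):
        self._owners = {}
        self._mutex = threading.Lock()

    def try_acquire(self, name, owner):
        with self._mutex:
            # Succeeds if the lock is free or already held by this owner.
            if self._owners.get(name, owner) != owner:
                return False
            self._owners[name] = owner
            return True

    def release(self, name, owner):
        with self._mutex:
            if self._owners.get(name) == owner:
                del self._owners[name]


class FileClient:
    """User-space client library: no kernel module, just calls to services."""

    def __init__(self, storage, locks, client_id):
        self.storage = storage    # dict standing in for the remote file service
        self.locks = locks
        self.client_id = client_id

    def append(self, path, data):
        # In a real system this would be an RPC; here it is a local call.
        self.storage.setdefault(path, []).append(data)

    def exclusive_append(self, path, data):
        # The file service has no lock operation; locking is delegated.
        if not self.locks.try_acquire(path, self.client_id):
            return False
        try:
            self.append(path, data)
            return True
        finally:
            self.locks.release(path, self.client_id)
```

Usage would look like `FileClient({}, LockService(), "client-a").exclusive_append("/f", b"x")`. Swapping either class for a new version is a plain library upgrade, which is the point being made about user-space clients.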