

dezgeg | 9 months ago

I wish checksumming filesystems had some interface to expose their internal checksums. It might not be useful for rsync, though, since filesystems should have the freedom to pick the best algorithm (so the filesystem checksum of the same file on different machines would be allowed to differ, e.g. if the filesystem block size was different). But git and build systems could use it to tell that 'these 2 files under the same directory tree are definitely identical'.
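
A minimal sketch of how a build tool might consume such an interface, if one existed. Everything about the interface is an assumption: `FsChecksum` and the `get_fs_checksum` callback are hypothetical stand-ins for whatever a filesystem might expose, and no real filesystem API is implied. The checksums are trusted only when both files report the same algorithm and block size; otherwise the code falls back to a byte-for-byte comparison. Matching digests mean "identical" only up to hash collisions, so this presumes a collision-resistant algorithm.

```python
import filecmp
from typing import Callable, NamedTuple, Optional


class FsChecksum(NamedTuple):
    """Hypothetical record a filesystem could expose for a file."""
    algorithm: str    # e.g. "sha256"; the filesystem is free to choose
    block_size: int   # checksums over different block sizes are not comparable
    digest: bytes


def files_identical(
    a: str,
    b: str,
    get_fs_checksum: Callable[[str], Optional[FsChecksum]],
) -> bool:
    """Compare two files, using filesystem checksums when they are comparable."""
    ca, cb = get_fs_checksum(a), get_fs_checksum(b)
    comparable = (
        ca is not None
        and cb is not None
        and (ca.algorithm, ca.block_size) == (cb.algorithm, cb.block_size)
    )
    if comparable:
        # Same algorithm and layout: the digests decide, no file data is read.
        return ca.digest == cb.digest
    # No usable checksum (or mismatched algorithms): read and compare the bytes.
    return filecmp.cmp(a, b, shallow=False)
```

Because the algorithm and block size travel with the digest, two machines (or two datasets) with different settings simply fall through to the slow path instead of giving a wrong answer.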


mustache_kimono|9 months ago

I actually once suggested this to ZFS, and saw some pushback! See: https://github.com/issues/created?issue=openzfs%7Czfs%7C1453...

Maybe someone else will have better luck than me.

rob_c|9 months ago

Sorry to break it to you. That's not about luck. You've asked for something which is nonsense if you want to "recycle" compute used to checksum records.

If you want them to store the checksum of the POSIX object as an attribute (we can argue about performance later), great, but directly using the checksums intrinsic to ZFS, which exist to guard against bitflips, is a bad call.
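
For illustration, the "checksum of the POSIX object as an attribute" idea can be approximated today from userspace. This is a sketch under stated assumptions, not anything ZFS does: the attribute name `user.example.sha256` is made up, `os.setxattr`/`os.getxattr` are Linux-only, and the filesystem must have user extended attributes enabled. The stored value includes the file's size and mtime so that a stale checksum is ignored after the file changes.

```python
import hashlib
import os

# Hypothetical attribute name, chosen for this example only.
ATTR = "user.example.sha256"


def file_sha256(path, chunk_size=1 << 20):
    """Hash the whole file contents in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store_checksum(path):
    """Compute the checksum and cache it in an extended attribute."""
    st = os.stat(path)
    digest = file_sha256(path)
    value = f"{st.st_mtime_ns}:{st.st_size}:{digest}"
    os.setxattr(path, ATTR, value.encode())  # Linux-only call
    return digest


def load_checksum(path):
    """Return the cached checksum, or None if it is missing or stale."""
    try:
        raw = os.getxattr(path, ATTR).decode()
    except OSError:
        return None  # attribute not set, or xattrs unsupported here
    mtime_ns, size, digest = raw.split(":")
    st = os.stat(path)
    if (int(mtime_ns), int(size)) != (st.st_mtime_ns, st.st_size):
        return None  # file changed since the checksum was stored
    return digest
```

The performance question raised above is visible here: `store_checksum` has to read the entire file, which is exactly the work the original suggestion hoped to avoid by reusing checksums the filesystem had already computed.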

hello_computer|9 months ago

If they expose it, that ties them to a particular hash algo. Hash algos are as much art as science, and the opportunities for hardware acceleration vary from chip to chip, so maintaining the leeway to move from one algo to another is kind of important.