top | item 45678239

contact9879 | 4 months ago

- they don’t need to do anything to conform to your arbitrary organization choices

- hashes are as long or short as you need them to be

- publication timestamp is in every ebook’s metadata, is almost guaranteed to be unique, increases monotonically, and has actual semantic meaning, unlike an ISBN or OCLC number

NoMoreNicksLeft | 4 months ago

>they don’t need to do anything to conform to your arbitrary organization choices

They don't need to. It'd be smart. It's not "arbitrary". It's fucking library science.

>hashes are as long or short as you need them to be

Hashes might uniquely identify a computer file, but they don't uniquely identify an edition/release of a published book. Some jackass on libgen decides to tweak a single byte, now it has a new hash... but it's not a new edition.
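
A quick sketch of that point, assuming SHA-256 over the raw file bytes. The byte string is a made-up stand-in for an ebook file, and `content_hash` is just an illustrative helper name:

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for the bytes of an ebook file.
original = b"PK\x03\x04 pretend this is an epub"

# Flip a single bit in the last byte, as a re-uploader's tweak might.
tweaked = bytearray(original)
tweaked[-1] ^= 0x01
tweaked = bytes(tweaked)

print(content_hash(original))
print(content_hash(tweaked))
```

Same book, same edition, same length in bytes, and the two digests have nothing in common.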

>publication timestamp is in every ebook’s metadata

As someone who looks at every internal OPF file, no... it's not in every ebook.
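
A rough sketch of what that check looks like, assuming the timestamp would live in the OPF's Dublin Core `dc:date` element. Both sample documents are made up, and `publication_date` is an illustrative helper, not part of any ebook tool:

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used for <dc:date> in OPF metadata.
DC_NS = "http://purl.org/dc/elements/1.1/"


def publication_date(opf_xml: str):
    """Return the text of the first <dc:date> in an OPF document, or None."""
    root = ET.fromstring(opf_xml)
    node = root.find(f".//{{{DC_NS}}}date")
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


# Hypothetical OPF that carries a publication timestamp.
WITH_DATE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Title</dc:title>
    <dc:date>2019-05-01T00:00:00Z</dc:date>
  </metadata>
</package>"""

# Hypothetical OPF with no date at all.
WITHOUT_DATE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Title</dc:title>
  </metadata>
</package>"""

print(publication_date(WITH_DATE))
print(publication_date(WITHOUT_DATE))
```

Any scheme keyed on that field has to decide what to do with the second case.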

You're suggesting I go to the extra trouble of doing a job they could do easily, when I can only do it poorly, and I don't know why... because the first person to respond was a dumbass and thought I was attacking him? I swear, 99% of humans are still monkeys.

ndriscoll | 4 months ago

You don't need to hash file contents (though that is often a useful thing to do). You can hash, e.g., the URL that was earlier claimed to be the canonical identifier. Running it through a hash function fixes your complaints about file names (pick a function and encoding whose output is not too long and contains only allowed characters).
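
A minimal sketch of the idea, assuming SHA-256 plus a lowercase base32 encoding truncated to 16 characters. The URL, the `url_to_id` name, and the length are all illustrative choices, not anything the site actually uses:

```python
import base64
import hashlib


def url_to_id(url: str, length: int = 16) -> str:
    """Derive a short, filename-safe identifier from a canonical URL.

    The output uses only the characters a-z and 2-7, so it is safe on
    case-insensitive file systems. `length` can be at most 52.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[:length]


# Hypothetical canonical URL for an ebook.
URL = "https://example.com/ebooks/some-author/some-title"

print(url_to_id(URL))             # 16-character identifier
print(url_to_id(URL, length=8))   # shorter, at a higher collision risk
```

Truncating trades length against collision risk: 16 base32 characters keep 80 bits of the digest, which is plenty for a personal library. The identifier also stays stable when the file bytes change, as long as the URL does not.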