watson | 5 years ago

Last I checked, you can't update a torrent. So if Wikipedia changes even a single letter, you'd need to download all the data again.

sktrdie|5 years ago

No, the pieces you already downloaded can be reused for the new torrent. Unchanged pieces have the same hash in both torrents, so they can be reused and only the pieces that differ need to be fetched: http://bittorrent.org/beps/bep_0038.html
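To make the piece-reuse idea concrete, here is a minimal sketch. It is not how any particular client implements BEP 38; the helper names (`piece_hashes`, `pieces_to_fetch`) and the tiny 4-byte piece size are assumptions chosen for illustration. It hashes fixed-size pieces of an old and a new version of the data and reports which pieces of the new version are not already available locally.

```python
import hashlib

# Assumption: a 4-byte piece size so the example stays readable.
# Real torrents use much larger pieces (hundreds of KiB or more).
PIECE_SIZE = 4


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """SHA-1 digest of each fixed-size piece of `data`."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def pieces_to_fetch(old: bytes, new: bytes, piece_size: int = PIECE_SIZE) -> list[int]:
    """Indices of pieces in `new` whose hash matches no piece we already hold from `old`."""
    have = set(piece_hashes(old, piece_size))
    return [
        i for i, h in enumerate(piece_hashes(new, piece_size))
        if h not in have
    ]


old = b"AAAABBBBCCCCDDDD"
in_place_edit = b"AAAABxBBCCCCDDDD"   # one byte overwritten, length unchanged
insertion = b"x" + old                # one byte inserted at the front

print(pieces_to_fetch(old, in_place_edit))  # [1]
print(pieces_to_fetch(old, insertion))      # [0, 1, 2, 3, 4]
```

Note the second case: inserting a byte shifts every piece boundary, so nothing can be reused. Piece reuse only pays off when the file format updates data in place rather than shifting it.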

This is also why sqlite is a good choice: it's highly optimized to change as few of its "pieces" as possible when an update occurs.

If you're implementing this behavior, trying to manage all kinds of different queries, building a querying engine on top of that, optimizing for efficiency and reliability, you're effectively rewriting a database. Sure you can do it, but why not take advantage of battle-tested off-the-shelf stuff for things like "databases" (sqlite) and/or "distributing data" (torrent)?

namibj|5 years ago

Actually, there is a solution for this. Just combine https://www.bittorrent.org/beps/bep_0030.html (Merkle-tree-based hashing) with https://www.bittorrent.org/beps/bep_0039.html (feed-URL-based updates), and in some settings also https://www.bittorrent.org/beps/bep_0047.html (specifically the padding files, so that flat files inside a torrent can also be shared efficiently in arbitrary combinations of whole files).
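As a rough illustration of why Merkle-tree hashing helps with updates, here is a simplified sketch. It does not reproduce BEP 30's exact tree layout or wire format; the function names and the zero-hash padding to a power of two are assumptions made for the example. It builds a binary hash tree over piece hashes and counts how many nodes change when a single piece is modified.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaves first and root last.

    Assumption: the leaf level is padded with all-zero hashes
    up to the next power of two.
    """
    width = 1
    while width < len(leaves):
        width *= 2
    level = leaves + [bytes(20)] * (width - len(leaves))
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [sha1(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def changed_nodes(a: list[list[bytes]], b: list[list[bytes]]) -> int:
    """Count the tree nodes that differ between two same-shaped trees."""
    return sum(x != y for la, lb in zip(a, b) for x, y in zip(la, lb))


pieces_old = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF", b"GGGG", b"HHHH"]
pieces_new = list(pieces_old)
pieces_new[2] = b"CxCC"  # modify a single piece

tree_old = merkle_levels([sha1(p) for p in pieces_old])
tree_new = merkle_levels([sha1(p) for p in pieces_new])

total = sum(len(level) for level in tree_old)
print(changed_nodes(tree_old, tree_new), "of", total)  # 4 of 15
```

Only the nodes on the path from the changed leaf to the root differ, so the metadata that needs to be re-sent grows with the tree's depth rather than with the number of pieces.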

black_puppydog|5 years ago

All those BEPs are in "Draft" status. Okay, libtorrent implements two of them. But also, BEP 39 (Updating Torrents Via Feed URL) doesn't really fit very well into the fully distributed setting because of the centralized URL part.

So now, to update the torrent file, you need a mechanism for a mutable document that you can update in a distributed but signed way. Or you could make an append-only feed of sequential torrent URLs... oh wait.

My point is: Hyperdrive's scope is sufficiently different from your proposed solution that yes, you could probably rely on existing tools (and I have much love for bittorrent based solutions!) but it starts feeling like shoehorning the problem into a solution that doesn't quite fit.