
WikipediaP2P

247 points | bpierre | 9 years ago | wikipediap2p.org | reply

112 comments

[+] oceanofsolaris|9 years ago|reply
I understand that this is an early implementation.

Nevertheless, I have a question: how does it make sure the site you receive is actually the

a) correct one (someone could distribute incorrect pages e.g. about controversial topics like North Korea or Climate Change; or inject some malicious code that e.g. uses your account to submit changes to wikipedia)

b) most recent one (or at least a reasonably new one).

I imagine the first one could maybe be somehow guaranteed through the protocol. Maybe you can just solve the second by invalidating your local copy after it is more than one day old.

Sadly, the GitHub link on the homepage does not work for me (Firefox on Linux), so I can't check.

[+] shmageggy|9 years ago|reply
This doesn't answer your question, but it says it's based on CacheP2P, which itself says (http://www.cachep2p.com/api.html) it's 99.99% WebTorrent (https://webtorrent.io/).

It doesn't look like WebTorrent is meant to deal with continually changing content, and I wasn't able to determine (after about 30 seconds of looking) how either of the two layers on top addresses that.

[+] osense|9 years ago|reply
> controversial topics like [..] Climate Change

You're making me sad on a Friday afternoon.

[+] guerrerocarlos|9 years ago|reply
- GitHub link fixed.
- Ideally, Wikipedia could just add a signature in the page metadata and the extension could check it against its public key.
- CacheP2P has one solution implemented (checksums of link contents defined in the metadata of each page), but it would require Wikipedia to enable it.
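The checksum variant is simple to picture. A hypothetical sketch (the function name and page bytes are made up, not taken from CacheP2P):

```python
import hashlib

# Trusted page metadata carries a SHA-256 checksum for each linked
# resource; content received from a peer is accepted only if it hashes
# to the expected value.
def accept_from_peer(expected_sha256: str, peer_bytes: bytes) -> bool:
    return hashlib.sha256(peer_bytes).hexdigest() == expected_sha256

trusted = hashlib.sha256(b"<html>Nikola Tesla ...</html>").hexdigest()
genuine = accept_from_peer(trusted, b"<html>Nikola Tesla ...</html>")   # True
tampered = accept_from_peer(trusted, b"<script>evil()</script>")        # False
```

The catch is exactly the one noted above: the checksums themselves have to come from somewhere trusted, i.e. Wikipedia would have to publish them.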
[+] discreditable|9 years ago|reply
A better way to support Wikipedia would be to donate. $10 probably covers more than the strain you put on Wikipedia for a year.

https://donate.wikimedia.org/

[+] atom_enger|9 years ago|reply
I encourage a recurring donation. If you work a tech job, you can very likely afford $5/month. The site has changed my life for the better, and I constantly find myself on there satisfying whatever is piquing my curiosity at the moment. To be clear, I'm not affiliated in any way, just a huge supporter.
[+] gield|9 years ago|reply
For the sake of testing it, I have opened nearly every link in the introduction of the Nikola Tesla [0] page, the one used in the example. It should now be possible to at least obtain all those articles through me, and hopefully also through other people.

[0] https://en.wikipedia.org/wiki/Nikola_Tesla

[+] qznc|9 years ago|reply
It burns CPU like crazy for me. Would not use it on battery.
[+] janus24|9 years ago|reply
Doesn't seem to work for me; all links are orange.
[+] rfrank|9 years ago|reply
I feel like I should contribute more than a pun, but this is a missed opportunity for wikip2pedia.org. The domain is available.
[+] pmlnr|9 years ago|reply
We'd be better off putting Wikipedia on IPFS and IPFS into the P2P cache.
[+] hobofan|9 years ago|reply
I agree, but sadly IPFS still needs some optimization for that to be feasible.

I published the whole Wikidata dataset as separately accessible entities to IPFS and the initial publishing took ~2 weeks. Theoretically updating the dataset with weekly changes should be pretty fast after that, but if there is a change that impacts the JSON structure of every entity you have to start all over (which recently happened).

Right now Wikipedia/Wikidata is only really useful as a stress test for IPFS, but I'm optimistic for the future.

For anyone interested in this there is some (slow) progress on this at https://github.com/ipfs-wikidata , though it's admittedly a low priority for me right now, as I wait for the technical state of IPFS to improve.
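A toy illustration of why a schema change forces starting over (not real IPFS CIDs, just the underlying content-addressing idea):

```python
import hashlib
import json

# Toy content-addressing: an entity's address is the hash of its
# serialized bytes. Real IPFS CIDs are multihashes, but the point holds:
# any change to the bytes yields a new address.
def address(entity: dict) -> str:
    blob = json.dumps(entity, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

old = {"id": "Q2", "label": "Earth"}
new = {"id": "Q2", "labels": {"en": "Earth"}}  # same fact, new JSON layout

changed = address(old) != address(new)  # True: every entity must be re-added
```

So a structural change to every entity's JSON means every address changes, and the whole dataset has to be published again from scratch.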

[+] goodplay|9 years ago|reply
If its execution is similar to this project, where hosting a page on your system is effortless (or at most a one-click affair), then it's definitely something I can get behind.

You'll get the benefit of ensuring offline availability as well as only hosting pages you care about. Win-win.

[+] cabalamat|9 years ago|reply
This could be a useful application to run on IPFS, which IMO needs a killer app to become widespread.
[+] tscs37|9 years ago|reply
The problem is that it requires installing third-party software to function (the IPFS client) or putting load on some IPFS gateway.
[+] teekert|9 years ago|reply
I've been wondering: would one achieve the same thing by pinning the root folder of Wikipedia on one's own server using IPFS? If Wikipedia then pointed their domain name to the ipfs.io gateway with a hash for that folder, would Wikipedia auto-update on my server, and would IPFS provide the load balancing/P2P part?

Or am I understanding IPFS or the ipfs.io gateway wrong? Does everything go through the ipfs.io server in that case? Or is it still distributed/torrent-like?

[+] lgierth|9 years ago|reply
For Wikipedia it would probably be more useful to embed js-ipfs [0] into pages, and fetch additional pages and assets after the first one from IPFS. js-ipfs can currently speak WebRTC to peer with other js-ipfs nodes, and WebSockets to peer with go-ipfs nodes.

[0] https://github.com/ipfs/js-ipfs

[+] tscs37|9 years ago|reply
If you want to do IPFS, you need to run IPFS locally to make it work, otherwise you're just redistributing bandwidth to the ipfs.io gateway.

IMO the presented solution is a bit better since it only relies on WebRTC to establish P2P, no extra software required.

[+] nextweek2|9 years ago|reply
I remember thinking about a similar concept a few years back. It would be good to have a p2p archive of human knowledge for end of the world scenarios.

The problem is that BitTorrent isn't the right protocol for it: it doesn't allow incremental changes. You also don't want the complete history like git; you want something that passes a diff around.
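A sketch of the "pass a diff around" idea, using Python's stdlib unified-diff format (illustrative only; BitTorrent itself has no such mechanism):

```python
import difflib

# Two revisions of the same (made-up) article text.
old_rev = ["Tesla was an inventor.\n", "He was born in 1856.\n"]
new_rev = ["Tesla was an inventor and engineer.\n", "He was born in 1856.\n"]

# Peers would exchange only this patch, not the full article on every edit.
patch = list(difflib.unified_diff(old_rev, new_rev,
                                  fromfile="rev1", tofile="rev2"))
```

The patch carries the changed line plus a little context, so for a small edit to a large article it is a tiny fraction of the full page.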

It would be great to have a p2p network with Wikipedia, a load of academic papers, maps and recipes, which anybody with a computer could contribute drive space to storing.

[+] kem|9 years ago|reply
I agree with your last statement, and think this is a great proof-of-concept for distributed networking.

I worry, though, that this space is fracturing in a way that hampers adoption. Between Freenet, ZeroNet, IPFS, I2P, not to mention things like Retroshare, Matrix, and so forth, there's a lot of redundancy, but with important differences between any two given solutions. I'm not sure what can be done about it. Something in this area needs a network effect to accelerate adoption, but to get one, a project needs to be comprehensive in what it offers while also being sufficiently differentiated from other projects. It's great to see so much going on in this area, but it's getting difficult to know what does what, and where one might put some resources.

[+] markovbling|9 years ago|reply
The database dumps are surprisingly small: you can store the whole thing on a 2 TB hard drive including media, or a 64 GB SD card if you just want the SQL database with text and metadata.
[+] krzyk|9 years ago|reply
It's a pity it doesn't have a Firefox plugin.
[+] elkos|9 years ago|reply
Should be easy to do.
[+] Izeau|9 years ago|reply
Why the “Read and change all your data on the websites you visit” permission?

Also, the “Fork me on GitHub” banner on your website is behind the fancy canvas so we can’t actually click it (Chrome 54).

Great idea otherwise!

[+] dmytrish|9 years ago|reply
Also, a separate domain, wikipediap2p.org, not (obviously) affiliated with wikipedia.org. If it were p2p.wikipedia.org, it would be fine for me.
[+] Maakuth|9 years ago|reply
I think changing the links to point to the cache instead of the original server needs that permission.
[+] rocky1138|9 years ago|reply
There is no way I'm installing this with those permissions requests. All data on all websites I visit? No way.
[+] stanislavb|9 years ago|reply
This seems like a decent project that has the potential to become very popular. I just tested it and it worked well; however, it clogged my CPU and the extension crashed in the end.
[+] thecatspaw|9 years ago|reply
How does it handle cache invalidation, e.g. when a page gets updated?
[+] yitchelle|9 years ago|reply
I wonder if this concept could be extended to site aggregators such as HN or Reddit? The amount of bandwidth saved could be significant.
[+] iansowinski|9 years ago|reply
Hmmm, maybe that's the option for saving Twitter?
[+] fermuch|9 years ago|reply
Check out mastodon.social :)
[+] gield|9 years ago|reply
I tested it in Chromium really quickly. I have one small remark: when opening an article obtained from a peer (a green link) in a new tab, it just opens in the same tab instead.
[+] zymhan|9 years ago|reply
So if I wanted to donate my computing resources and bandwidth to hosting this, should I just install the plugin and leave chrome running on my "server"?
[+] grondilu|9 years ago|reply
This seems to be introducing synchronisation issues or something. I've just made an edit to an article; when I go back to it, I see the edited version only if I manually refresh the page.