I like the idea, but I think a p2p search should also include human filtering and trust. I.e., I know the keys of my trusted friends, and sites/pages they've "approved" rank higher in my searches.
Once you implement this, there would be lots of privacy concerns. But when it is done by smaller companies or non-commercial entities, the concerns might be smaller.
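To make the idea concrete, here is a minimal sketch of trust-weighted re-ranking. Everything in it is an assumption for illustration: the `Result` type, the `approvals` mapping from URL to approving key IDs, the `boost` weight, and the key names are hypothetical and not part of YaCy or any existing search engine. It skips signature verification entirely and just assumes approvals have already been checked against the keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    url: str
    base_score: float  # relevance score from the underlying search (hypothetical)


def rerank(results, approvals, trusted_keys, boost=1.0):
    """Sort results so that pages approved by trusted keys rank higher.

    approvals: dict mapping url -> set of key IDs that approved that page.
    trusted_keys: set of key IDs the searching user trusts.
    boost: score added per trusted approval (an arbitrary illustrative weight).
    """
    def score(result):
        # Only approvals from keys the user trusts count toward the boost.
        endorsers = approvals.get(result.url, set()) & trusted_keys
        return result.base_score + boost * len(endorsers)

    return sorted(results, key=score, reverse=True)


results = [
    Result("https://example.org/a", 0.9),
    Result("https://example.org/b", 0.5),
]
# "mallory-key" approved page b too, but is not trusted, so it adds nothing.
approvals = {"https://example.org/b": {"alice-key", "mallory-key"}}
trusted = {"alice-key"}

ranked = rerank(results, approvals, trusted)
```

Because only keys in the user's own trust set count, approvals from unknown keys cannot move a page up, which is the property that makes this different from a global vote.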
Um, am I alone in getting a sinking feeling that the word "security" appears only in one place on their site, and it's in regard to having your searches snooped? Hint to FSF: not everyone on the web is an altruist.
I tried a few test searches, and didn't seem to get many useful results at all. Searching for [debian] did not produce any debian.org results anywhere on the first page. Similarly, searching for [google] did not produce google.com (or any other google domain) on the first page. Searching for [lwn] produced one random LWN comment, but nothing else. Searching for [linux] produced a page full of links to the Wikipedia articles on Linux in numerous different languages, in no sensible order.
Interesting code base. Java with a templating engine (de.anomic.server.serverObjects) I have never seen before. Worth some reading time.
Bigger picture: YaCy would need to reach a large critical mass of nodes before being useful, so it would seem to be difficult to get enough people to donate server resources.
Also, it is not clear how to keep anyone from doing SEO by running nodes that make it a priority to spider promoted web sites.
Is there a way they check the validity of every peer's database? What if I edit the index on my computer so that my website comes at the top of the results for highly competitive terms (similar to Google link bombing)? And if I can lease hundreds of computers, then I would be first in the results...
So this spreads the index across all nodes, right? I probably don't want an index of the entire web on my hard drive. But at the same time, how efficient can it be to hit a bunch of different nodes every time I search? How is it going to affect me when people hit my node?
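One way to picture the trade-off is a sketch where each search term is assigned to a single peer by hashing, so no peer holds the whole index and a query only contacts the peers that own its terms. This is an assumption for illustration, using plain hash-modulo partitioning; it is not taken from YaCy's code, and the peer names and function names are hypothetical.

```python
import hashlib


def peer_for_term(term, peers):
    """Pick the peer responsible for a term by hashing the lowercased term."""
    digest = hashlib.sha256(term.lower().encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it
    # onto the list of peers.
    return peers[int.from_bytes(digest[:8], "big") % len(peers)]


def peers_for_query(query, peers):
    """Return the set of peers a multi-term query would need to contact."""
    return {peer_for_term(term, peers) for term in query.split()}


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]  # hypothetical peer names
contacted = peers_for_query("debian linux kernel", peers)
```

Under this scheme a query contacts at most one peer per term rather than every node, and each node stores only the terms that hash to it. The cost is that adding or removing a peer reshuffles most assignments, which is why real systems tend to use a distributed hash table instead of a plain modulo.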
I downloaded the peer software... how do I know how many peers I am connected to, how do I know what my computer is actually doing, and why does "local" YaCy return 0 results for everything?
jshen|14 years ago
nmridul|14 years ago
JulianMorrison|14 years ago
JoshTriplett|14 years ago
mark_l_watson|14 years ago
jshen|14 years ago
xorglorb|14 years ago
nmridul|14 years ago
andrewflnr|14 years ago
pyre|14 years ago
runn1ng|14 years ago
Questions, questions, questions.