top | item 32833907

Show HN: A search engine based on RSS feed

134 points | daviducolo | 3 years ago | github.com

23 comments

[+] toa697|3 years ago|reply
Love the idea behind this.

Generally, any new way of attempting to find signal among the noise of the internet is good. I think RSS is niche enough to remove a ton of noise from the dataset automatically, and the fact that this will mostly target articles and blogs gives it a distinct flavor.

Gonna put this with the Marginalia astrolabe in my "small web search" bookmark folder.

[+] assadk|3 years ago|reply
What other stuff do you have in this 'folder'?
[+] skilled|3 years ago|reply
I like this idea a lot but this specific project is done pretty badly. There's a lot of spam and noise in results, and no real filtering system in place.

It would be awesome to create something like this from a massive archive of personal blogs in spaces like tech, development, design, etc. Basically, a massive RSS reader curated by the community itself.

[+] kbyatnal|3 years ago|reply
I was very excited by this and it's promising, but it needs a bit more work to be usable. I was hoping the RSS technique would help avoid the SEO and blogspam that Google is now filled with. Unfortunately, I still get quite a bit of that (my query was "espresso machine" and I got a bunch of listicles, etc).

I think applying a layer of curation on top would go a long way to fixing this.

[+] mgr86|3 years ago|reply
Quick take: wouldn't it potentially compound the blogspam? WordPress installs are probably among the largest providers of RSS these days.
[+] gaius_baltar|3 years ago|reply
That's nice. Will it start crawling for new feeds, or will they be manually curated?
[+] daviducolo|3 years ago|reply
Each user can add the feeds they prefer; the system then imports the data automatically every day.
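A minimal sketch of what a daily import like this might look like, assuming plain RSS 2.0 feeds and an in-memory index keyed by entry link. None of this is taken from the project: `parse_rss_items`, `merge_into_index`, `search`, and `make_feed` are hypothetical names, the feed contents are made up, and the download step is left out so the example stays self-contained.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Return a list of {title, link, description} dicts from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # entries without a link cannot be used as index keys
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": link,
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def merge_into_index(index, items):
    """Add unseen entries to the index (keyed by link); return how many were new."""
    added = 0
    for entry in items:
        if entry["link"] not in index:
            index[entry["link"]] = entry
            added += 1
    return added


def search(index, query):
    """Naive search: keep entries whose title or description contains every query term."""
    terms = query.lower().split()
    hits = []
    for entry in index.values():
        text = (entry["title"] + " " + entry["description"]).lower()
        if all(term in text for term in terms):
            hits.append(entry)
    return hits


def make_feed(entries):
    """Build a tiny RSS 2.0 document from (title, link, description) tuples (test data only)."""
    body = "".join(
        f"<item><title>{t}</title><link>{l}</link><description>{d}</description></item>"
        for t, l, d in entries
    )
    return f"<rss version='2.0'><channel><title>Example</title>{body}</channel></rss>"


# Two daily snapshots of the same hypothetical feed; the second one overlaps the first.
day_one = make_feed([
    ("Espresso basics", "https://blog.example/a", "Pulling a first shot"),
    ("RSS tips", "https://blog.example/b", "Feeds for search"),
])
day_two = make_feed([
    ("RSS tips", "https://blog.example/b", "Feeds for search"),
    ("Grinder review", "https://blog.example/c", "Burrs for espresso"),
])

index = {}
new_day_one = merge_into_index(index, parse_rss_items(day_one))
new_day_two = merge_into_index(index, parse_rss_items(day_two))
espresso_links = [entry["link"] for entry in search(index, "espresso")]
```

Keying on the link means a re-import of the same feed only adds entries that were not seen before, so the index keeps growing even when the feed itself only lists its most recent posts.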
[+] 0x9b|3 years ago|reply
Super cool project!

I tried solving the search quality problem (for technical content like engineering blogs) a while back by filtering using heuristics based on websites/urls. At one point I was experimenting with RSS, but found that many websites only show the past X number of entries and gave up that direction since it excluded too much content.

Are you seeing a similar problem now?
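For illustration, URL-based filtering of this kind could be sketched roughly as below. This is an assumption about what such heuristics might look like, not the commenter's actual rules: the blocked domain, the blocked path fragments, and the function name `looks_like_article` are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical rule sets; a real deployment would need curated lists.
BLOCKED_DOMAINS = {"spam.example"}
BLOCKED_PATH_PARTS = ("/jobs/", "/careers/", "/tag/", "/category/")


def looks_like_article(url):
    """Return True if the URL passes the simple domain and path heuristics."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not host or host in BLOCKED_DOMAINS:
        return False
    path = parsed.path.lower()
    if not path.endswith("/"):
        path += "/"  # so "/careers" is matched by the "/careers/" rule
    return not any(part in path for part in BLOCKED_PATH_PARTS)


candidates = [
    "https://blog.example/posts/rss-search",
    "https://www.spam.example/top-10-gadgets",
    "https://blog.example/jobs/123",
]
kept = [url for url in candidates if looks_like_article(url)]
```

A filter like this is cheap to run at import time, but it only catches patterns someone has already listed, so it would complement curation or ranking and not replace them.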

[+] edding4500|3 years ago|reply
Fantastic, great idea! The amount of job offers when searching for 'software engineering' is a bit annoying, though.
[+] pasttense01|3 years ago|reply
The major problem is the selection of the RSS feeds. If you look at the list of feeds you see a lot of traditional news feeds like CNN or New York Post. These are well covered in Google/Google News and are fairly low quality.
[+] daviducolo|3 years ago|reply
Yes, you are right, but the list is chosen by the users. I am working on an algorithm for ranking the entries in order to get higher-quality results.
[+] aendruk|3 years ago|reply
> At the moment is not possible to add source Feed if you have feed proposals open an issue with the URLs to add

What are the long-term plans for this? Building a crawler? Or just manual curation?

[+] llamataboot|3 years ago|reply
So far, a few searches haven't shown a single relevant result, and they have shown a lot of strange spam.