top | item 30542391

gargarplex | 4 years ago

I guess blogs that are linked-to in non-killed HN comments should probably be crawled a bit. Have you considered using social user karma (this could be a 1-10 score uniquely calculated for users of each of HN, Twitter, Reddit as long as it's built in a modular way) as a weight in a PageRank style schema?

Here's how I am going to evaluate your search engine. Yesterday I searched Google for "get dynamodb table row count" and found this URL, https://bobbyhadz.com/blog/aws-dynamodb-count-items, which provides a terrible recommendation involving a full table scan.

With DontBeEvil, I didn't find the correct answer either, which is to use the describe-table API.
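For reference, a minimal sketch of the describe-table approach using boto3. The helper name `get_approximate_item_count` and the table name `my-table` are made up for illustration; `describe_table` and the `Table.ItemCount` field are part of the DynamoDB API. Note that DynamoDB only refreshes `ItemCount` roughly every six hours, so this is an approximate count rather than a live one.

```python
def get_approximate_item_count(client, table_name):
    """Return the item count reported by DescribeTable, without scanning the table.

    `client` is expected to be a boto3 DynamoDB client (or anything exposing
    a compatible describe_table method). The value is approximate because
    DynamoDB updates ItemCount only periodically.
    """
    response = client.describe_table(TableName=table_name)
    return response["Table"]["ItemCount"]


# Usage, assuming boto3 is installed and AWS credentials are configured:
#
#   import boto3
#   client = boto3.client("dynamodb")
#   print(get_approximate_item_count(client, "my-table"))
```

Passing the client in keeps the helper easy to exercise with a stub, and the equivalent CLI call is `aws dynamodb describe-table --table-name my-table`.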

If you really plan to dedicate a year to this, I would strongly encourage you to re-post as soon as you have a strong update. Right now this has the potential to provide value but really does not yet. So update us when you are confident that you are providing value! But we think you're on to a great opportunity.

alangibson|4 years ago

> I guess blogs that are linked-to in non-killed HN comments should probably be crawled a bit

They are, but there are relatively few of them because my only page content source is the Common Crawl. The hit rate against the total set of URLs I'm interested in is not great. I expect to fix this soon.

I'm also not indexing entire sites, only specific upvoted URLs. This will change as well.

> Have you considered using social user karma (this could be a 1-10 score uniquely calculated for users of each of HN, Twitter, Reddit as long as it's built in a modular way) as a weight in a PageRank style schema?

Definitely. I've already started calculating a rank coefficient for submitters, but it's not yet completely clear how best to use it.
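One possible way to use such a coefficient, as a rough sketch only and not how the engine actually works: scale each user's karma to a 1-10 score per platform, then use that score as the edge weight in a weighted PageRank iteration. The names `karma_score` and `weighted_pagerank`, the linear scaling, and the damping value are all assumptions made for illustration.

```python
from collections import defaultdict


def karma_score(karma, max_karma):
    """Map raw karma to a 1-10 score, computed separately per platform (assumed linear)."""
    if max_karma <= 0:
        return 1.0
    clamped = max(0, min(karma, max_karma))
    return 1.0 + 9.0 * clamped / max_karma


def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (source, target, weight) edges.

    A source's rank is split among its targets in proportion to edge weight,
    so a link posted by a high-karma user passes on more rank.
    """
    nodes = set()
    out_weight = defaultdict(float)
    for source, target, weight in edges:
        nodes.update((source, target))
        out_weight[source] += weight

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, target, weight in edges:
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank


# Two URLs linked from the same thread, one by a high-karma user, one by a new user.
edges = [
    ("hn_thread", "url_a", karma_score(9000, 10000)),
    ("hn_thread", "url_b", karma_score(10, 10000)),
]
ranks = weighted_pagerank(edges)
```

Because the score is computed per platform before it reaches the graph, an HN, Twitter, or Reddit scorer could be swapped in without touching the ranking loop.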

> Here's how I am going to evaluate your search engine

Feel free to dump more of these. Some solid test cases would be very helpful.