top | item 3746274

Ask HN: How is HN's ranking algorithm implemented?

2 points| jgannonjr | 14 years ago | reply

There is an older article from a post about a year and a half ago titled "How Hacker News Ranking Algorithm Works" (http://news.ycombinator.com/item?id=1781013). I understand how the algorithm works mathematically but how is it implemented on the server? More specifically, how does the server keep its rankings updated if each stories rank decays as a function of the time since submission? The ranks must be getting continuously calculated at some point, but I'm not sure when that would be--it doesn't make sense to calculate them on request, and there are way too many stories to be constantly processing all of them all the time. My thought is that the rank is recalculated after each vote, and there must also be some rank threshold at which a story's rank is automatically processed and recalculated continuously in the background, this way the recent stories are constantly being processed while older ones past the threshold become stale and are ignored. This could also be why there is no real pagination (instead the next url is /x/fnid=xxxxxxxxxx), possibly to discourage people from going too deep. I could also be completely wrong. Can anyone shed some light on this?

4 comments

order
[+] cd34|14 years ago|reply
http://arclanguage.com/install

contains the source code less some of the secret bits and bytes.

[+] jgannonjr|14 years ago|reply
Thanks for the link. I by no means have any experience reading lisp, not to mention there is a lot of code here to sort through (especially for someone not versed in lisp), but from what I can make out from a quick glance seems to confirm my initial guess. It looks like rank is calculated when an item is voted on. Also, it looks like a random front page story is reranked regularly to ensure no items get "stuck" at the top if voting stops (rerank-random). There's also a function (gen-topstories) to generate the rankings for the top 1000 stories, although I can't tell if this is being run regularly (it looks like it is just used to initialize the rankings). However, I could definitely be wrong.

I'm not able to determine how often the rerank-random function is called (looks like every 30ms maybe?) or how often the gen-topstories function is called (if it is even being called regularly or not).

Can you verify what I'm saying here is correct, or correct me if I'm wrong? Also, any insight would be much appreciated.

[+] _3ex7|14 years ago|reply
Just by observing, I think it goes like this:

    if linked_article.is_relevant_to_YC_company():
        goto front_page;
    elsif linked_article.is_negative_to_YC_company():
        linked_article.dead();