How reddit tried to solve the "new link" problem. Why HN doesn't need a new algo

198 points| jedberg | 12 years ago | reply

This morning I saw two articles on the front page about how HN should change their algorithms. I would contend that an algo change is not the right solution.

Here is what we did to try and solve the problem on reddit.

First, there is the "organic" box at the top of the page. The first link in that box is always an ad, but after that, it shows pseudo-random links from the new page (more on that in a second):

https://github.com/reddit/reddit/blob/89f6f1ad9c1babbf520b83c49fa27f509bb5d0ef/r2/r2/lib/organic.py

What this does is give exposure to up-and-coming links to a lot of people all at once, which helps overcome the luck factor of who is looking at the new page at any given time.

The second solution is the "rising" sort on the new page:

https://github.com/reddit/reddit/blob/89f6f1ad9c1babbf520b83c49fa27f509bb5d0ef/r2/r2/lib/rising.py

The rising sort's ranking algo accounts for how many times a link has been shown, which helps the better new links rise to the top.

The organic box on the front page uses this rising ranking to choose what is in the box, and also contributes to the view counts.
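
To make the idea concrete, here is a minimal Python sketch of a views-aware ranking. It is not the code in rising.py or organic.py: the `Link` class, the `impressions` counter, the smoothing constants, and the function names are all assumptions made for illustration. It also picks the top entries deterministically, whereas the real box is described above as pseudo-random.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    upvotes: int
    impressions: int  # hypothetical counter: how many times the link was shown


def rising_score(link: Link, prior_votes: float = 1.0, prior_views: float = 20.0) -> float:
    """Smoothed upvotes per impression (assumed formula, not reddit's).

    The priors keep a link with almost no views from getting an extreme score.
    """
    return (link.upvotes + prior_votes) / (link.impressions + prior_views)


def pick_organic(links: list, slots: int = 3) -> list:
    """Choose the highest rising scores for the box and count each showing as a view."""
    chosen = sorted(links, key=rising_score, reverse=True)[:slots]
    for link in chosen:
        link.impressions += 1  # being shown in the box feeds back into the view count
    return chosen


if __name__ == "__main__":
    new_page = [Link("a", 2, 10), Link("b", 2, 200), Link("c", 0, 50)]
    print([link.title for link in pick_organic(new_page, slots=2)])  # ['a', 'c']
```

With equal votes, the link that has been shown less often ranks higher, which is the property that lets a barely seen link get a fair test.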

So I would humbly suggest that HN should do as it has done often in the past, and copy reddit's solution here by implementing the rising sort and the organic box.

67 comments

[+] jedberg|12 years ago|reply
Why was my title edited?

It said "How reddit tried to solve the "new link" problem/ why HN doesn't need a new algo"

How is this new title an improvement? I would at least expect a comment here when a title is edited as to why so I can learn for next time.

[+] drakaal|12 years ago|reply
Not an admin, so I can't tell you for sure. Lots of times they edit titles they feel are designed to "game" the HN audience. Posting "why hn doesn't need a new algo" if you aren't part of the HN team likely was viewed as either misrepresentation, or an attempt to game.

I don't have any direct insight, just my past experience.

[+] ClayFerguson|12 years ago|reply
At least they didn't censor you when you posted the note saying they edited your content. If they wanted to they could have just deleted that too. I think any sort of censorship at all, once the word gets out, will be damaging to HackerNews, and they should stop doing it before it becomes their "reputation". Communities like this can form elsewhere. Hopefully they know they have no monopoly, and should err on the side of NON-censorship at all times.
[+] rhizome|12 years ago|reply
As a reader, I'd say that only one title is necessary, so maybe they just took the first one. Also, sorry, but your post is pretty poorly written so it's kind of hard to figure out which of the titles better applies.
[+] amerika_blog|12 years ago|reply
HN uses the editorial approach to content, figuring that having some people edit user-supplied content will make it more standard and enjoyable, like a newspaper.

I have some doubts about this as well.

[+] jawns|12 years ago|reply
I think the techniques Reddit uses to give exposure to deserving links are nice.

Here are just a few other ways you can fine-tune a "link recommendation" algorithm beyond just the standard "show highly rated links at the top" technique:

1) Devote a portion of prime real estate (e.g. homepage) to new or trending links, as Reddit does.

2) Give higher placement to submissions that come from someone whose previous submissions the user has upvoted.

3) Give higher placement to submissions that come from the same source as previous submissions the user has upvoted.

4) Give higher placement to submissions on which a person has commented whom the user has previously upvoted.

One way I think HN, Reddit, and other link-recommendation sites can put power into their users' hands is to allow each user to tweak the recommendations algorithm to suit their own preferences.

For instance, one user might want half their homepage to be filled with trending stories, rather than popular stories. Another user might find Technique 2 above to be useful but might not want to enable Technique 4.
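
As a rough sketch of what per-user tuning could look like, the Python below applies techniques 2-4 as optional multipliers that each user can switch on or off. Every name here (`Submission`, `UserPrefs`, the `*_boost` fields) is hypothetical, and the multiplicative scoring is just one possible choice, not how HN or Reddit rank links.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    title: str
    score: int
    author: str
    domain: str
    commenters: set = field(default_factory=set)


@dataclass
class UserPrefs:
    liked_authors: set = field(default_factory=set)
    liked_domains: set = field(default_factory=set)
    liked_commenters: set = field(default_factory=set)
    author_boost: float = 0.0     # technique 2; 0.0 means disabled
    domain_boost: float = 0.0     # technique 3; 0.0 means disabled
    commenter_boost: float = 0.0  # technique 4; 0.0 means disabled


def personalized_score(sub: Submission, prefs: UserPrefs) -> float:
    """Base score scaled by whichever boosts this user has enabled."""
    boost = 1.0
    if sub.author in prefs.liked_authors:
        boost += prefs.author_boost
    if sub.domain in prefs.liked_domains:
        boost += prefs.domain_boost
    if sub.commenters & prefs.liked_commenters:
        boost += prefs.commenter_boost
    return sub.score * boost


def rank_for(prefs: UserPrefs, subs: list) -> list:
    """Order submissions for one user, highest personalized score first."""
    return sorted(subs, key=lambda s: personalized_score(s, prefs), reverse=True)
```

A user with all boosts left at 0.0 sees the plain score ordering, so the shared front page stays available as the default.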

[+] jedberg|12 years ago|reply
Your suggestions sound great on the surface, but I suspect 2-4 would increase the echo chamber problem.

Also, it's computationally difficult to compute 2-4 in real time (reddit used to do a similar calculation a long time ago, under the now defunct recommended section).

[+] slykat|12 years ago|reply
I personally would not want #2-#4. One of the reasons I like Hacker News is that its system is not customized to my personal behavior. I like that the front page looks the same to me when I'm logged in and when I'm not (as far as I'm aware). I prefer the wisdom (or madness) of the crowd to the bubble of myself.

I also think #2-#4 would erode the diversity of topics & sources on my front page.

[+] arh68|12 years ago|reply
If my HN frontpage were that user-dependent, I'd want to see both. I'd read my HN, but I would definitely browse in Incognito mode just to see the 'real' HN.

One has to wonder what draws all these people to sites like HN. I don't think they know for sure. At first, the site was great without me. There were lots of interesting links without me asking for them. As it becomes more amplified, with the front page being hotly contested & measured, mechanisms getting more complicated, etc., it seems we may get what we never wanted.

[+] pbhjpbhj|12 years ago|reply
>"One way I think HN, Reddit, and other link-recommendation sites can put power into their users' hands is to allow each user to tweak the recommendations algorithm to suit their own preferences." //

This.

When scores were removed this was my reaction; if you don't want scores why does that mean I can't have them?

Diverse algos for ranking would also work against gaming of the system IMO.

[+] strict9|12 years ago|reply
This issue isn't what needs solving. The much larger problem is the "unknown or expired link" page.

What year is this? Why are we still accepting an implementation detail as an excuse for an awful user experience?

[+] nacs|12 years ago|reply
Yeah this is the biggest problem with HN.

Go to the front page, click an article or 2 and click to go to the next page and it's already "expired" in minutes. For a site that caters to startups/developers it's pretty embarrassing.

[+] fnbaptiste|12 years ago|reply
Yes. I hit that page almost every time I hit the 'more' link.
[+] ancarda|12 years ago|reply
HN still uses tables for design. There's a mix of CSS and <font> tags. The login/register page has no styling at all. Searching is handled by a third party (HNSearch). Your top-bar color won't persist on some pages (e.g. submit).

HN has a lot of issues but yes, by far the most annoying is the link expiration.

[+] arh68|12 years ago|reply
Well what's the solution?

Do you simply redirect all "expired link" GETs to the home page? Should we have a fixed # of 10 first pages (like some imageboards)?

[+] eggbrain|12 years ago|reply
I think the biggest issue is that the Hacker News admins want to have as few moving pieces as possible. It's why Hacker News doesn't have collapsible comments, why it doesn't have a mobile layout, etc. There have been some tweaks here and there, but I think the biggest changes I've noticed over the past ~3 years I've been here are that they removed karma count from comments, and they made the up-vote triangle high resolution.

They seem feature-averse, and I assume it is because A) Adding more features leads to more causes of failure, B) Front-end/back-end code additions lead to higher page file size/more computation on the backend (meaning higher costs for them to deliver content) and C) the K.I.S.S. Principle

[+] icefox|12 years ago|reply
I would have to put my money on the way simpler answer D) A few users might complain, but the traffic keeps growing so why bother and E) YC is eating up all my time and is more interesting and F) The site clearly works (the traffic keeps growing) so it isn't as interesting to hack on anymore.
[+] vacri|12 years ago|reply
I would like it to have fewer moving parts. I'm not fond of 'unknown or expired link' - that link shouldn't 'move' :)
[+] joelrunyon|12 years ago|reply
How hard would it be for them to simply make the site completely responsive? That's not a huge task & still keeps things pretty simple.
[+] recuter|12 years ago|reply
I understand the endless desire to tweak, I really do, but HN simply doesn't have anywhere near the volume of posts Reddit does.

This place has the same feel as: http://www.reddit.com//r/depthhub

Most of the action is in the comments and a lot of the traffic is from lurkers. It's a slow roll, in other words, and frankly there is a very finite amount of good-quality new posts to be had.

That is the real problem - try and rank this place more optimally by hand as an experiment, it simply won't take that long. Where is all the great content you are trying to algorithmically float?

[+] jedberg|12 years ago|reply
I don't actually think HN has a problem, I was just suggesting that if I were wrong about that, here is an alternate solution.
[+] cLeEOGPw|12 years ago|reply
What I think they should at least do is remove the penalty for 40+ comments. I see where this is coming from: the desire to always have a "quality discussion" effect by showing only those comment sections with very few comments, which are usually quite insightful as they usually come from people interested enough in the often obscure topic. But what it also does is kill discussion. At least they could make the penalty grow gradually, not something that suddenly activates as soon as there are 40 comments.
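
A small Python sketch of the difference between the two shapes, under loud assumptions: only the 40-comment threshold comes from this thread. The 0.5 penalty factor, the 120-comment point where the full penalty is reached, and both function names are made up for illustration and say nothing about HN's real ranking code.

```python
def step_penalty(comments: int, threshold: int = 40, factor: float = 0.5) -> float:
    """Rank multiplier that drops all at once when the threshold is reached."""
    return factor if comments >= threshold else 1.0


def gradual_penalty(comments: int, start: int = 40, full_at: int = 120,
                    factor: float = 0.5) -> float:
    """Rank multiplier that ramps linearly from 1.0 down to `factor`."""
    if comments <= start:
        return 1.0
    if comments >= full_at:
        return factor
    progress = (comments - start) / (full_at - start)  # 0.0 at start, 1.0 at full_at
    return 1.0 - progress * (1.0 - factor)


if __name__ == "__main__":
    for n in (39, 40, 80, 120):
        print(n, step_penalty(n), gradual_penalty(n))
```

With these assumed numbers, a story at 40 comments keeps its full rank under the gradual version and only loses half of it once the thread reaches 120 comments.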
[+] amerika_blog|12 years ago|reply
That place is the worst of the self-congratulatory, narcissistic, Reddit hive-mind.
[+] gabemart|12 years ago|reply
Does the "rising" page on reddit actually work? On subreddits with millions of subscribers it seems to show one or two submissions, at best, and often the votes cast on those submissions don't reveal why they're considered to be rising. For example, many submissions on the rising page have scores, using the format (upvotes,downvotes), of (1,0) or (1,1).

Is this the intended behavior?

[+] jedberg|12 years ago|reply
Yes. Remember, it is a function of both votes and views, so something with two votes that's only been viewed a few times will still be higher than something with more views and two votes.

It's not perfect, but it is better than just straight chronological.

[+] Houshalter|12 years ago|reply
The solutions presented were better and more mathematically sound. This might work, but that algorithm maximizes the exposure of deserving posts and optimizes the tradeoff between testing new posts and sticking with proven ones. This is just arbitrarily throwing new links to a top box that everyone ignores.

And reddit's solution doesn't seem to be working much better IMO, tons of posts get buried with little exposure. The rising section seems to be usually empty or just 2 or 3 totally random posts.

Reddit's algorithm is often criticized for heavily favoring quickly consumed content like images because they get votes quicker. It is also easily manipulated by bots/sock-puppets.

[+] mburst|12 years ago|reply
Not sure if a rising sort is really needed. A link only needs a few points to get on the front page of HN. Based on my experience, if a link gets 3 votes in the first 15-20 min or 5 in an hour, it'll get some frontpage time.

I think it may almost be easier to just show a random new link from the past hour rather than doing anything fancy. I'm sure a ton of good content misses the frontpage just because of the sheer lack of visibility that links on the new page get.
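
For comparison, the "nothing fancy" version fits in a few lines of Python. This is only a sketch: the dictionary layout with a `submitted_at` Unix timestamp and the function name `random_recent_link` are assumptions for illustration, not anything HN exposes.

```python
import random
import time


def random_recent_link(links, now=None, window_seconds=3600, rng=random):
    """Return one uniformly random link submitted within the window, or None."""
    now = time.time() if now is None else now
    recent = [
        link for link in links
        if 0 <= now - link["submitted_at"] <= window_seconds
    ]
    return rng.choice(recent) if recent else None


if __name__ == "__main__":
    current = time.time()
    new_page = [
        {"title": "fresh", "submitted_at": current - 600},   # 10 minutes old
        {"title": "stale", "submitted_at": current - 7200},  # 2 hours old
    ]
    print(random_recent_link(new_page, now=current)["title"])  # fresh
```

Passing `now` and `rng` explicitly keeps the function easy to test; a seeded `random.Random` instance can be supplied as `rng` for repeatable picks.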

[+] shaunrussell|12 years ago|reply
I think the real problem is that most of us come to the site too often... take a stretch and go outside :)
[+] joshdance|12 years ago|reply
I think just giving a small portion of the front page to new stories could help a lot.