Hey HN!
I'm one of the two developers behind Hackerhunt. As much as I love Hacker News and its ranking algorithm for the front page news, it has a downside for the Show HN submissions. A lot of cool and useful stuff people have actually made themselves gets lost in /shownew without a real chance to get to the right audience. That's where the idea of a curated and categorised list, à la Product Hunt, was born.
This is a very early proof of concept and any suggestions on how to make it better are welcome!
Really cool site. I recently asked HN[1] why the free alternative to PH died and found this postmortem thread[2].
I think it really boils down to making sure that people coming to your site find something new and creative all the time, to help turn lurkers or one-time visitors into repeat visitors. I think PH does that quite well with their podcasts, daily digests, Twitter updates (they feel forced, but they do work), etc. Also, you're building a community site, so if the traffic dies in a month, keep at it; it looks like sites like these take many years to gain that traction. Basically, I think you have something really great going here; just make sure to focus on bringing the visitors back and you will definitely have a winner!
If you can nail down the categorization, get some more historical stuff, and maximize that newsletter or just suggestions, this should be a great utility.
Hey, thanks! It's a bit of deep learning magic combined with a few days of manual labor tagging the training set :)
The classifier itself is an LSTM and runs on TensorFlow. As said, this is a proof of concept so we'll try to improve it over time.
I have a similar classification project where I used word2vec to build embeddings from 1GB+ of text; then I just compute the vector similarity between the article and the topics.
The vector of an article can be obtained by summing the vectors of its words (minus stop words). For a topic you just sum up 5-10 of the topic keywords. You don't need to exhaustively list all the topic keywords because word2vec automatically maps them in close vicinity.
This system has the advantage that you don't need a training dataset. It's unsupervised learning coupled with a small amount of supervised topic pointers.
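A minimal sketch of the approach described above, assuming you already have word vectors loaded. The tiny 3-dimensional `EMBEDDINGS` table, the stop-word list, the topic keywords, and all function names are invented for illustration; real word2vec vectors are learned from a large corpus and have hundreds of dimensions.

```javascript
// Toy "embeddings" with made-up values, standing in for real word2vec output.
const EMBEDDINGS = {
  database: [0.9, 0.1, 0.0],
  sql:      [0.8, 0.2, 0.1],
  query:    [0.7, 0.1, 0.2],
  index:    [0.8, 0.0, 0.1],
  game:     [0.0, 0.9, 0.1],
  player:   [0.1, 0.8, 0.2],
  level:    [0.1, 0.7, 0.1],
  fast:     [0.3, 0.3, 0.3],
};

// Hypothetical stop-word list; a real one would be much longer.
const STOP_WORDS = new Set(["a", "the", "for", "with", "and", "of"]);

// Each topic is described by a handful of keywords, not an exhaustive list.
const TOPICS = {
  databases: ["database", "sql", "query"],
  games: ["game", "player", "level"],
};

function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Sum the vectors of all known words, skipping stop words.
function sumVectors(words) {
  const dims = Object.values(EMBEDDINGS)[0].length;
  const total = new Array(dims).fill(0);
  for (const word of words) {
    const vec = EMBEDDINGS[word];
    if (STOP_WORDS.has(word) || vec === undefined) continue;
    for (let i = 0; i < dims; i++) total[i] += vec[i];
  }
  return total;
}

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // no known words on one side
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pick the topic whose keyword vector is closest to the article vector.
// Returns a null topic when no topic scores above zero.
function classify(text) {
  const articleVec = sumVectors(tokenize(text));
  let best = null;
  let bestScore = 0;
  for (const [topic, keywords] of Object.entries(TOPICS)) {
    const score = cosineSimilarity(articleVec, sumVectors(keywords));
    if (score > bestScore) {
      best = topic;
      bestScore = score;
    }
  }
  return { topic: best, score: bestScore };
}

console.log(classify("A fast index for the database").topic); // databases
console.log(classify("A new level for the player").topic);    // games
```

Note that "index" is not among the `databases` keywords, yet the first title still lands in that topic because its vector sits close to the keyword vectors. That is the property the comment relies on when it says the keywords don't need to be listed exhaustively.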
This is very useful. As a frequent HN user, I find myself strolling down the Show HN tab quite often, and the current UI doesn't let you go more than 3 pages deep (posts ~5-6 days old).
Nice work! I just noticed a bug, though: trying to go to the next page of system software sorted by votes doesn't work. Instead of going to the next page, the first page is reloaded with /NaN appended to the URL, like so: https://hackerhunt.co/topic/system/votes/NaN
That's really cool! Thanks for making this website, it's absolutely useful and it might save a lot of projects. I can tell because I have posted on Show HN myself and my submission never made it past /shownew. Probably because it was not that interesting to HN's audience, but I can imagine how many really cool projects end up buried in there.
Those stories would get onto /show and also the front page if they got more upvotes, so it is indeed a curation problem. If you're willing to do the work of rescuing good submissions that the rest of us missed, that's great! I wonder if we could integrate that back into HN somehow.
Yes, Hacker Hunt indexes all Show HN submissions, including those that never make it to the main page. Actually, that was the whole reason to make Hacker Hunt happen.
degif | 8 years ago
kumaranvpl | 8 years ago
artur_makly | 8 years ago
castell | 8 years ago
jamil7 | 8 years ago
superasn | 8 years ago
[1] https://news.ycombinator.com/item?id=14584527
[2] https://news.ycombinator.com/item?id=11233967
andrewjrhill | 8 years ago
ohadron | 8 years ago
lamby | 8 years ago
http://i.imgur.com/68OeJ94.jpg
vincnetas | 8 years ago
[1] http://europa.eu/rapid/press-release_IP-17-1784_en.htm
ruiramos | 8 years ago
prawn | 8 years ago
I wondered about maybe having the list for today, then perhaps some other recent options in a slimmer format either beside or below?
Nice work though!
brimstedt | 8 years ago
A feature I've been missing on Hacker News, that perhaps you'd be willing to add, is a community-written TL;DR for each link.
I.e., apart from the title and link, a short (200 chars or so) description and TL;DR.
jacquesm | 8 years ago
http://news.ycombinator.com/item?id=2158116
overcast | 8 years ago
veli_joza | 8 years ago
arquLV | 8 years ago
visarga | 8 years ago
fiiv | 8 years ago
unknown | 8 years ago
[deleted]
sauravt | 8 years ago
edraferi | 8 years ago
oblio | 8 years ago
if(!res.statusCode===500){ TODO
};
Hee-hee :D
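For anyone who missed the joke in the snippet above: `!` binds tighter than `===`, so the condition is parsed as `(!res.statusCode) === 500`, which compares a boolean with a number and is never true. The sketch below shows the difference; the `res` objects are stand-ins, and `res.statusCode !== 500` is only a guess at what was intended.

```javascript
// Stand-in response objects; this is not the site's real code.
function buggyCheck(res) {
  // Parsed as (!res.statusCode) === 500. A boolean is never strictly
  // equal to the number 500, so this is false for every status code.
  return !res.statusCode === 500;
}

function intendedCheck(res) {
  // Presumably what was meant: "the status code is anything but 500".
  return res.statusCode !== 500;
}

for (const statusCode of [200, 404, 500]) {
  const res = { statusCode };
  console.log(statusCode, buggyCheck(res), intendedCheck(res));
}
// 200 false true
// 404 false true
// 500 false false
```

In other words, the body of that `if` can never run, whatever the server responds with.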
hunt | 8 years ago
https://hackerhunt.co/topic/system/votes/NaN
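A /NaN path segment like the one in this URL usually comes from arithmetic on a missing or non-numeric page value. The sketch below is only a guess at the failure mode, not the site's actual code; `nextPageUrlBuggy`, `nextPageUrl`, and their parameters are hypothetical names.

```javascript
// Hypothetical URL builders; the site's real routing code is not shown in the thread.
function nextPageUrlBuggy(topic, sort, currentPage) {
  // If the first page has no page segment, currentPage is undefined,
  // and undefined + 1 evaluates to NaN.
  return `/topic/${topic}/${sort}/${currentPage + 1}`;
}

function nextPageUrl(topic, sort, currentPage) {
  // Fall back to page 1 when the value is missing or not a positive integer.
  const parsed = Number.parseInt(currentPage, 10);
  const page = Number.isInteger(parsed) && parsed > 0 ? parsed : 1;
  return `/topic/${topic}/${sort}/${page + 1}`;
}

console.log(nextPageUrlBuggy("system", "votes", undefined)); // /topic/system/votes/NaN
console.log(nextPageUrl("system", "votes", undefined));      // /topic/system/votes/2
console.log(nextPageUrl("system", "votes", "3"));            // /topic/system/votes/4
```

Validating the page number before doing arithmetic on it also keeps a hand-edited URL such as `/votes/NaN` from propagating to the next link.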
degif | 8 years ago
kryptogeist | 8 years ago
dang | 8 years ago
epicide | 8 years ago
http://imgur.com/a/LLZhI
pipu | 8 years ago
subsidd | 8 years ago
I have a question: does it also index submissions that never make it to the main Show page?
Also, shameless plug: I am hosting an event inspired by Show HN in Hyderabad, India (showhyd.com)
degif | 8 years ago