top | item 1694049

Please review my API for HackerNews

104 points | ronnier | 15 years ago | api.ihackernews.com

52 comments

[+] erikpukinskis|15 years ago|reply
Not sure if you have some other strategies for the URL scheme, but I'd probably use RESTful paths... like:

/users/{username}/posts instead of /by/{username}

or

/posts/{id}/comments instead of /comments/{id}

And how come threads are indexed by userid but posts are indexed by username? These kinds of things are the things that slip by developers and give them headaches. I can easily imagine not noticing the username/userid switch and being like "WTF!? 404?!" for a while.

I would expect the standard REST paths would a) make it easier to guess the paths and b) allow for simpler URL generation in client apps (you can generate the url for a user and then just tag on /comments or /posts to get the url for those things)
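
A minimal sketch of point (b), assuming the path layout proposed in this comment rather than the API's actual routes. The host is the one mentioned in the thread; the helper names are hypothetical.

```python
from urllib.parse import quote

# Host taken from the thread. The paths below follow the scheme proposed
# in this comment, not necessarily what the live API serves.
BASE_URL = "http://api.ihackernews.com"


def user_url(username: str) -> str:
    """Build the base URL for a user, percent-encoding the name."""
    return f"{BASE_URL}/users/{quote(username, safe='')}"


def user_posts_url(username: str) -> str:
    """Tag /posts onto the user URL."""
    return user_url(username) + "/posts"


def user_comments_url(username: str) -> str:
    """Tag /comments onto the user URL."""
    return user_url(username) + "/comments"


def post_comments_url(post_id: int) -> str:
    """Comments are nested under the post they belong to."""
    return f"{BASE_URL}/posts/{post_id}/comments"
```

Every user-related URL is derived from `user_url`, so a client only has to decide once whether it is keyed by username or by user id.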

[+] blasdel|15 years ago|reply
I'm sorry, but that isn't what RESTful means at all. Structured URIs like you mention can be pretty, but are completely unrelated to REST. In the real world they are often used in an anti-REST way, specified in advance instead of linked via hypermedia. If the client needs to use foreknowledge to construct URI strings, that goes against everything REST stands for.

REST is Hypertext As The Engine Of Application State — the default modus of ActionController::Routing::Routes has nothing to do with it.

A design that uses only opaque UUIDs as names for resources and reveals them to the client via links in the responses is perfect REST. Clean-looking URIs are a distraction, except that they tend to be easier to preserve across software rewrites.
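
To illustrate the distinction, here is a rough sketch of a client that never builds a URI itself and only follows links found in responses. Everything in it is assumed for illustration: the in-memory `RESOURCES` table stands in for a server, and the opaque identifiers, the `links` field, and the relation names are all made up.

```python
# Hypothetical in-memory "server": opaque URIs mapped to JSON-like documents.
# The client is only ever told the entry point "/".
RESOURCES = {
    "/": {"links": {"user": "/r/3f2a"}},
    "/r/3f2a": {
        "username": "ronnier",
        "links": {"posts": "/r/9c41", "comments": "/r/77b0"},
    },
    "/r/9c41": {"items": ["Please review my API for HackerNews"], "links": {}},
    "/r/77b0": {"items": [], "links": {}},
}


def fetch(uri):
    """Stand-in for an HTTP GET that returns a parsed document."""
    return RESOURCES[uri]


def follow(start_uri, *rels):
    """Walk from the entry point by link relation name, never by URI shape."""
    doc = fetch(start_uri)
    for rel in rels:
        doc = fetch(doc["links"][rel])
    return doc
```

`follow("/", "user", "posts")` reaches the posts document without the client knowing what `/r/9c41` means, so the server could rename every URI without breaking it.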

[+] ronnier|15 years ago|reply
Great advice, I agree with you. I'll work on changing it and the docs, but leave the existing paths for some time. I've learned something, this was all worth it :)
[+] stwe|15 years ago|reply
An API that is subject to change should have a version number as a namespace somewhere in the URL. That way you can have different API versions running side by side, which makes it less painful to move forward.
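
One way this could look, as a sketch only: the `/v1` and `/v2` prefixes, the handler names, and the response shapes are assumptions for illustration, with `/by/{username}` standing in for the existing path and `/users/{username}/posts` for the proposed one.

```python
def posts_by_user_v1(username):
    """Old-style route: /v1/by/{username}."""
    return {"version": 1, "user": username}


def user_collection_v2(username, collection):
    """New-style route: /v2/users/{username}/{collection}."""
    return {"version": 2, "user": username, "collection": collection}


# The version is part of the routing key, so both generations of the
# API can be served by the same process at the same time.
ROUTES = {
    ("v1", "by"): posts_by_user_v1,
    ("v2", "users"): user_collection_v2,
}


def dispatch(path):
    """Split '/<version>/<resource>/<args...>' and call the matching handler."""
    version, resource, *args = path.strip("/").split("/")
    try:
        handler = ROUTES[(version, resource)]
    except KeyError:
        raise LookupError(f"no route for {path!r}") from None
    return handler(*args)
```

Existing clients keep calling `/v1/...` unchanged while new ones move to `/v2/...`.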
[+] edparcell|15 years ago|reply
How timely. I just started writing a library to scrape data from Hacker News because I wanted to put the posts I'd upvoted in the sidebar of my blog.

Link: http://blog.edparcell.com/how-i-added-my-hacker-news-saved-s...

Your API has advantages and disadvantages against this approach: On the upside, it provides a uniform way for all languages to access content from HN, which is really cool.

On the downside, all requests through your API have to flow through your server - this makes me uneasy for two reasons: First that you could switch off your servers, esp. if take-up is high and you are not being compensated sufficiently for running them. And second, because I'm uncomfortable authenticating to an intermediary.

[+] Tichy|15 years ago|reply
If PG isn't opposed to an API, maybe somebody could hack the HN code to add it natively?
[+] anoopengineer|15 years ago|reply
For anyone interested, I have created a Java library wrapping the JSON APIs exposed by Ronnie.

http://github.com/anoopengineer/jhackernews

Currently supports only fetching of News pages - top pages, new pages and ask HN pages. Support for comments and voting to be added soon.

Licensed under Apache 2.0 license.

[+] mcyger|15 years ago|reply
Very cool idea. I will definitely start using it on my iPhone.

Regarding security, you are proxying login credentials through your server. Is that correct? I'd suggest putting up information about your privacy policy, whether you store any credentials, and the security of your server(s).

[+] ronnier|15 years ago|reply
That's a good idea. I'll put that up tonight.

FYI, I don't store any data at all. The username and password are required to get an auth token from HN, which is only needed for voting and commenting. The token is what's stored in the cookie that HN issues.

[+] sjbach|15 years ago|reply
How about a way to retrieve comments or post ID for a given story URL? I often save articles and read them days or weeks later, and it would be nice if there were a simple way to find the associated HN discussion without risking upvote/story submission using the bookmarklet.
[+] bittersweet|15 years ago|reply
I could probably build something like that on top of ronnier's API. I'm not sure, though, because I haven't had a chance to look at it and the API page is not loading for me at the moment.
[+] ronnier|15 years ago|reply
I'm unable to do that because there's not really a way to do that on HN now. I don't store any data so I have nothing to query against.
[+] ethikal|15 years ago|reply
I second this. It would be awesome if you could search by url.
[+] petervandijck|15 years ago|reply
Looks really awesome, congrats. So it scrapes hackernews and then exposes the data as a JSON api?
[+] ronnier|15 years ago|reply
Yes, and caches the data for a couple of minutes.
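
A scrape-then-cache setup like this might be sketched as below. This is not the author's implementation: the class name, the `fetch` callback, and the 120-second lifetime are assumptions chosen to match "a couple of minutes".

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) if it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch(key)       # e.g. scrape and parse the HN page
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical usage: at most one scrape per page every two minutes.
page_cache = TTLCache(ttl_seconds=120)
```

Nothing is persisted here, which matches the point made elsewhere in the thread that there is no stored data to search.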
[+] pvg|15 years ago|reply
It sounds like a plea to be smacked with a banhammer, more than anything else.

It seems to support automating things that are almost certainly better off left un-automated - posting, voting, commenting.

It doesn't support anything that might be interesting to automate - say, asynchronous notification on replies to me, posting or commenting on my url, mention of my name, mention of keywords I care about, etc.

It asks for HN credentials.

Nit, but still a little lame - lifts the HN favicon.

[+] daleharvey|15 years ago|reply
ihackernews is currently the best browser for hacker news on mobile by far (it's far better than the "app" in the android market).

this api is just extracting what he already built for ihackernews and allowing others to use it, I would be very very surprised if pg banned it, if it causes any issues then they can pretty surely be sorted out.

[+] TamDenholm|15 years ago|reply
Very awesome, needs a search, but yes, I'm sure it'll get use.
[+] Robin_Message|15 years ago|reply
This looks really handy for getting hold of my raw data. One thing -- parentID for comments I fetched with http://api.ihackernews.com/threads/Robin_Message is blank -- is it meant to be or am I missing something? I'd expect it to be the id of the parent comment, and possibly for there to be an "On" field that takes me up to the top level.
[+] ronnier|15 years ago|reply
Thanks, I'll look at this tonight and get it fixed.
[+] thibaut_barrere|15 years ago|reply
Very nice!

Two questions. I'm curious, as I'm also in the process of indexing HN, and your API may help me avoid this:

- how much content are you actually indexing? Do you keep every single post, or only the ones that make it onto the home page or Ask HN? How far back in time did you go?

- do you have some way to implement a full-text search (e.g. posts that contain a specific word, to be accurate)?

[+] ronnier|15 years ago|reply
I don't store or save any data, other than an in memory cache. I just scrape, process, and output the data. Since I'm not storing data, I have nothing to search.
[+] sahillavingia|15 years ago|reply
I may very well use this to launch a Hacker News reader for iOS. Is there space for this (would you want it)?
[+] pyronicide|15 years ago|reply
This is awesome, I was looking for something exactly like this last weekend! No more scraping for me =)
[+] mike-cardwell|15 years ago|reply
IIRC there is some Internet Explorer issue involving the "application/json" content type which makes it safer to just use "text/plain". Worth looking up...
[+] RyanMcGreal|15 years ago|reply
You usually access an API programmatically rather than via the browser, so this shouldn't be an issue for most use cases.
[+] abp|15 years ago|reply
Which language/framework did you use to create it?
[+] gsiener|15 years ago|reply
I thought PG frowned on this kind of stuff?
[+] ronnier|15 years ago|reply
Can you link me? If he does, I'll bring it down.