top | item 7249019

Real-time Search as a Service

92 points| davidbarker | 12 years ago |algolia.com | reply

36 comments

[+] nkurz|12 years ago|reply
To keep your users engaged, search results need to show up instantly and be relevant to them, even when they do typos.

To try this out, I went to the demo page for searching TV episodes (http://www.algolia.com/demo) and searched for "The Wire Season 2". Here are the four results given, with the highlighted portions bracketed:

[The Wire] Gag Reel [Season 2]

[The] Simple Life [Season 2] - Special - The Stuff We [Were]n't Allowed To Show You

[The] Farmer Wants a [Wife] (Australia) [Season] 6, Episode [2]

[The] Cosby Show [Season 2] DVD Extra: New Interview with [Dire]ctor Jay Sandrich

Rather than seeking "engagement", I'd put more emphasis on having high quality search results. Having 3 of the 4 results ignore the properly typed title of the show is a terrible interface. Correcting "Wire" to match "Director" is absurd.
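
One way the highlights above could arise (a guess at the behaviour, not Algolia's documented algorithm) is a rule that accepts a query word if it is within one edit of the same-length prefix of an indexed word. The sketch below assumes that rule; `fuzzy_prefix_match` and `max_typos` are hypothetical names chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep) a character
        prev = curr
    return prev[-1]


def fuzzy_prefix_match(query_word: str, indexed_word: str, max_typos: int = 1) -> bool:
    """Hypothetical rule: compare the query word with the equally long
    prefix of the indexed word and allow up to max_typos edits."""
    q = query_word.lower()
    prefix = indexed_word.lower()[:len(q)]
    return edit_distance(q, prefix) <= max_typos


for word in ["Wire", "Weren't", "Wife", "Director", "Cosby"]:
    print(word, fuzzy_prefix_match("Wire", word))
```

Under this assumed rule, "Wire" matches "Weren't", "Wife", and "Director" with a single substitution each, which is consistent with the bracketed highlights. Ranking exact matches strictly above one-typo matches would be one way to keep the real episodes of "The Wire" on top.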

The sad part is that these results might make you think that the episodes for season 2 of "The Wire" aren't in the database. But they are, just not indexed in a way that lets them be found using the exact phrase "Season 2".

Trying to be more constructive, there is a typo in the first sentence of your Intro, where the name of your company is spelled wrong. Also, "Real-Time Search" usually means search against a database that is being constantly updated. Anyway, I need to get back to screaming at the kids on my lawn.

[+] jlemoine|12 years ago|reply
(I am the Co-founder of Algolia)

Thanks for your comment. You are right that the data are not perfectly indexed for this query. We took the data from one of our customers, whose use case is to search only for TV show names.

We will improve our demo to cover that case.

Thanks

[+] js7|12 years ago|reply
Also try this:

"Futurama holiday" shows one result, which has the word "Episode" in the description. Try "Futurama holiday episode" and you get no results.

[+] MWil|12 years ago|reply
If you're wondering if Algolia is right for you, just ask them. Within 5 minutes of opening the chat window I had the CEO, Julien, helping guide me through the process of getting my XML into JSON to see if it was right.

Then he asked me more about my use case and actually steered me towards an Elasticsearch solution since it sounded like a better fit.

All in all, we went back and forth communicating for 3-4 days, only for him to lose me by necessity, and I already feel like a satisfied customer.

[+] MWil|12 years ago|reply
just to be clear, this was about a week ago

obviously the CEO is watching this thread now so he should be quick to grab people now as well, I imagine

[+] anxrn|12 years ago|reply
I don't understand what makes this particular service tout 'realtime' as its primary selling point.

Don't all search engines (and other hosted search services) aim for fast (100s of milliseconds) retrieval, show-as-you-type and realtime indexing?

Don't get me wrong. Getting all this right is very hard, and kudos for the great performance numbers (vs Elasticsearch), but 'realtime search' smacks of marketing copy.

[+] jlemoine|12 years ago|reply
You can try to search-as-you-type on our hacker news search to see the difference with other search engines: http://hn.algolia.com/

You have relevant results after each keystroke, even with typos. Classical engines use approximation to perform instant search, like the suggest module of Elasticsearch.

[+] nestlequ1k|12 years ago|reply
No offense, but I hate your business model. Convincing devs to put their search db in the hands of a small hosted startup is a recipe for disaster (see indextank).

There must be a better way. ElasticSearch and MongoDB use open source business models that I think tend to work much better for smart devs picking technologies (irrespective of their actual products).

[+] kwi|12 years ago|reply
Hmm, from my own experience - yes, there are alternative open search engines, and quite a lot of them actually. But did you ever try one on your own? Most of them are a nightmare to set up and are atrociously slow as soon as you get a few thousand entries... Sometimes it's definitely worth externalizing some expertise. Search is definitely not easy to master.

(And I was a user of indextank when they shut down)

[+] hboon|12 years ago|reply
What was wrong with indextank?
[+] johns|12 years ago|reply
Found this because hnsearch.com is migrating to it. It's very fast. http://hn.algolia.com
[+] jared314|12 years ago|reply
Unfortunately it does not seem to have the accuracy, or breadth, of the old hnsearch.com. Hopefully this will be fixed in time, but I have found it lacking relevant results and find myself switching back to hnsearch on most occasions.

I also wonder about all the other small applications in the "HN ecosystem", like karma tracker, that rely on the hnsearch API. I see that algolia has an API, but will those other projects just die too?

[+] jpdlla|12 years ago|reply
That explains why I noticed a couple of things off on my aggregator. I was using hnsearch.com/rss, which seems to have been altered recently and is now missing most of the data I was actually using.
[+] johnnymonster|12 years ago|reply
All great and good to be very fast, but at what price? From their page it costs $450 for 5mil records. In the search world, this is nothing. So I guess it's going to come down to whether your company is at the point where it needs to shave off 1-200ms for hundreds of dollars a month.

Second, I would wait and see how their reliability hashes out before I rely on them for any production services.

[+] jlemoine|12 years ago|reply
The search world is very big :) 5 mil records is nothing if you index logs (which is not a typical Algolia use case), but it is big from an e-commerce perspective, for example.
[+] jhonovich|12 years ago|reply
How is this different than Swiftype? I ask because I am a current Swiftype user and am trying to understand what your case to switch might be.
[+] ses|12 years ago|reply
I think improving on relevance ranking configuration would be a big boost to this product as well as offering some ability to cross-search multiple indexes. Both are quite difficult problems to solve well in search, but if a simple API service was available that might be attractive for larger commercial customers.

The icing on the cake would be to have some support for relational (at least partially relational) data and multimedia / files. Good luck!

[+] nodesocket|12 years ago|reply
First of all, great job guys. The library support is fantastic (node.js, python, ruby, php, even a shell client). We are currently pushing our nginx logs to ElasticSearch, and were going to use ES for some new features on https://commando.io, but instead we will use algolia.
[+] indiehamad|12 years ago|reply
How does it work with API calls? How many calls are typically made by a real-time search for, say, a 10-letter keyword?
[+] ndessaigne|12 years ago|reply
We usually recommend performing one query (one API call) per keystroke, starting from the first one. The actual number of calls depends a lot on the use case. Our ranking takes into account both relevance and popularity to suggest the best result first, which greatly reduces the number of letters you need to type. In use cases where there is a very strong popularity indicator, like the number of followers for TV shows, we usually get the correct result at the first keystroke (b -> breaking bad, d -> dexter). At the other extreme, you may need to type several words.
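
To make the call count concrete, here is a rough simulation of one query per keystroke with popularity-ordered prefix results. It is only a sketch: the catalogue, the follower counts, and the function names are invented for illustration, and it does not use or reproduce the Algolia API or its actual ranking.

```python
from typing import NamedTuple


class Show(NamedTuple):
    title: str
    followers: int  # hypothetical popularity signal


# Made-up catalogue and follower counts, for illustration only.
CATALOGUE = [
    Show("Breaking Bad", 900),
    Show("Banshee", 120),
    Show("Dexter", 700),
    Show("Doctor Who", 400),
]


def search(prefix: str, limit: int = 3) -> list[str]:
    """Return titles containing a word that starts with prefix, most followed first."""
    p = prefix.lower()
    hits = [s for s in CATALOGUE
            if any(word.startswith(p) for word in s.title.lower().split())]
    hits.sort(key=lambda s: s.followers, reverse=True)
    return [s.title for s in hits[:limit]]


def keystrokes_until_top(target: str) -> int:
    """Issue one query per keystroke and count the calls made until target ranks first."""
    for n in range(1, len(target) + 1):
        results = search(target[:n])
        if results and results[0] == target:
            return n
    # The target never ranked first, so every keystroke cost one call.
    return len(target)


for title in ["Breaking Bad", "Dexter", "Doctor Who", "Banshee"]:
    print(title, keystrokes_until_top(title))
```

With these invented numbers the popular titles surface after a single call, while a less popular one such as "Banshee" needs three. For a 10-letter keyword the upper bound in this model is 10 calls, and a strong popularity signal usually cuts that short.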