
Ask HN: Is there room for another search engine?

149 points| _6cj7 | 9 years ago | reply

193 comments

[+] Animats|9 years ago|reply
That's a good question, and something I've spent much time on. Cuil (2008-2010) tried. I knew some of those people. It cost them about $30 million to launch a full scale search engine. They had no revenue model. In retrospect, they were hoping to be acquired by somebody. It was some ex-Google people, trying to replicate older Google technology. They had a great launch, but the system wasn't very good and traffic rapidly fell off. Their technology wasn't that great. Their big selling point was that they could do the job on less hardware than Google used.

Yahoo had a search engine from 1995 to 2009. Yahoo is now a Bing reseller. There was a period around 2007 when Yahoo search was better than Google search. They pioneered integrated vertical search: special cases for weather, celebrities, and such. But Google copied that.

Blekko (2010-2015) had a scheme with "slashtags" which attracted a small following but never caught on. They were trying to crowdsource part of the problem. Eventually, Blekko was acquired by IBM's Watson unit, and ceased offering public search.

Bing, Microsoft's entry, remains active. Microsoft seems to have given up on trying to raise Bing's market share. Bing no longer has a CEO of its own; it's just a miscellaneous online service Microsoft provides. It's still #2 in search, but only has 7% market share.

There remain a few little search engines. Ask, formerly Ask Jeeves, continues to operate, but has only 0.17% market share. Ask is from IAC, in Oakland, a spinoff of Barry Diller's Home Shopping Network. Excite, formerly Excite@Home, with 0.02% market share, continues to operate. Excite, in its day, was a hot startup powered by too much venture capital.

Outside the US, there's Baidu (China) and Yandex (Russia). Neither has much traction outside their home countries.

It's possible to do a better search engine than Google from the user perspective. It's not clear how to get it to profitability. There are two things Google does badly - business legitimacy and provenance. Google doesn't background-check businesses online. (I do that with Sitetruth; it's not only possible, it could be done better with a tie-in to costly business background services such as Dun and Bradstreet.) This allows bogus and marginal businesses to reach the top of search via the usual SEO techniques. Google is also bad at provenance - figuring out that site A is using text derived from site B, and thus B should be ranked higher. This is what allows scraper sites to rank highly in Google.

Fix those two problems, and a new search engine could be better than Google. Whether anyone would notice is questionable. Profitability would be tough. The reward for success is high. Search ads are more relevant and more profitable than any other form of advertising. When someone sees a search ad, they're actively looking for the item of interest and may be ready to buy. Almost all other ads are interruptions or annoyances. That's the basic reason for Google's success.
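
A rough sketch of the provenance idea, to make it concrete. This is not how Google or Sitetruth works; it assumes the crawler keeps a first-seen date for each page, and the record layout (`url`, `text`, `first_seen`), the function names, and the 0.5 threshold are all made up for illustration. It compares overlapping word windows ("shingles") and, when two pages share most of them, treats the earlier-seen page as the likely origin.

```python
from datetime import date


def shingles(text: str, k: int = 5) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_origin(page_a: dict, page_b: dict, threshold: float = 0.5):
    """Guess which of two pages is the source of shared text.

    Each page is a hypothetical record: {"url", "text", "first_seen"}.
    Returns the URL of the earlier-seen page if the texts overlap heavily,
    or None if they look unrelated.
    """
    sim = jaccard(shingles(page_a["text"]), shingles(page_b["text"]))
    if sim < threshold:
        return None
    earlier = min(page_a, page_b, key=lambda p: p["first_seen"])
    return earlier["url"]
```

First-seen dates are only a proxy: a scraper that gets crawled before the original would fool this, so a real system would need more signals.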

[+] thesmallestcat|9 years ago|reply
There are way more than two things that Google does wrong. Remapping my search terms into oblivion so it can pretend it's fast is the worst one. Especially when this happens to a query I've modified to quote "every" "single" "flipping" "term." I think Google is cheating, and that their usable index is much shallower than they'd have you believe.

What's needed is a search engine with functional queries (as opposed to Google, which now only operates in "the user is drunk" mode), that doesn't give a damn about your robots.txt, and that can capture content in a way that is more akin to archive.org than Google's shoddy and increasingly absent cache.

Another issue is spam/false matches. Why does Google return illegitimate results? Because, let me tell you, any search for "some nifty computer book pdf" returns pages upon pages of bogus links leading to ad link mazes. A crawler should be able to trivially crawl such a page, determine that no PDF is linked, and blacklist the result, but this doesn't happen.
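
To show how cheap the first-pass check could be, here is a minimal sketch using only the Python standard library. It is an illustration under assumptions, not what any real crawler does: `LinkCollector` and `links_to_pdf` are hypothetical names, and it only looks at the file extension in each link's path.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_pdf(html: str) -> bool:
    """True if any link's path ends in .pdf (query strings are ignored)."""
    collector = LinkCollector()
    collector.feed(html)
    return any(urlparse(h).path.lower().endswith(".pdf") for h in collector.hrefs)
```

A page that promises a PDF but returns `False` here is a candidate for demotion. A stricter check would fetch each link and inspect the response's Content-Type, since PDFs are sometimes served from URLs without the extension.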

Google is slow and preoccupied. Their business is ripe for disruption.

[+] dhimes|9 years ago|reply
When someone sees a search ad, they're actively looking for the item of interest and may be ready to buy.

My experience with Google is the opposite of this. When people are searching google they are looking for information. When people go to Amazon they have their credit card in their hand.

Most of my google-ad clicks (85% last check) come from people who are clearly not interested in buying what they are looking for. They are only on my page for seconds. If I try to qualify my ads better I lose 'quality score,' which is a measure that is entirely, at least to a first-order approximation, about whether or not people will click the ad. That is, if my ad says 'This site is X' I get a good quality score but shitty leads. If I say 'This site is X for $49.99' I get a shitty quality score but the people who come to the site (the clicks I have to pay for) are ready to buy.

The profitability isn't because their initial hypothesis of Search Marketing (that people are searching what they want to purchase and will therefore purchase) was correct, but because they are far-and-away the search winners and if you want to be found at all you have to play it.

At least, in my experience and IMHO.

[+] 19eightyfour|9 years ago|reply
One could argue that a monopoly has no interest in improving service.

Also, it seems to me that both weaknesses described, legitimacy and provenance, are actually good for Google's core ad business, because if legitimate advertisers could reliably rank highly without paying, they'd have no reason to pay to outrank the competition: the competition who are buoyed by the nebulous practices Google pretends to frown upon but, for the sake of a delicate balance, cannot justify completely stamping out.

The perhaps unpalatable and unutterable truth is that ad-supported search is a delicate business: be good enough to attract eyeballs, but not so good that advertisers who could pay you have no need to pay.

[+] DanBC|9 years ago|reply
> Search ads are more relevant and more profitable than any other form of advertising. When someone sees a search ad, they're actively looking for the item of interest and may be ready to buy.

ActionCookBook accurately recreates my experience. https://twitter.com/actioncookbook/status/834439563032555521

   ME: [views product online] Hmm. Nah.
   [changes website]
   PRODUCT: we meet again
   ME: Sorry, no [changes again]
   PRODUCT: bitch this ain't over
   ME: FINE. FINE. I will buy this rug. Just leave me at peace.
   REST OF INTERNET: this dude loves rugs, let's get him, boys
[+] bambax|9 years ago|reply
Thanks for this excellent comment. I wonder if this is true though:

> Google is also bad at provenance - figuring out that site A is using text derived from site B, and thus B should be ranked higher.

The part I dispute is the last part. What matters is the user perspective. So of course a site that does nothing but scrape another one should rank lower, but many scrapers add value, if none other than UI-wise. So it's not obvious that the site of origin should rank higher.

[+] Vinnl|9 years ago|reply
7% for Bing? That's huge!

(Also, how about DuckDuckGo? I've got the (admittedly gut) feeling that it should at least outperform Ask and Excite.)

[+] tvural|9 years ago|reply
"Fix those two problems, and a new search engine could be better than Google. Whether anyone would notice is questionable."

Yeah, I don't think many people will care about the difference between good and perfect. You might be able to find a niche in search that Google is ignoring, but you would have a hard time expanding from there into general search.

In that sense Google is a bet against technology - you would invest in Google if you believe the Web's going to stay the same for a long time and nothing will replace it.

[+] tima101|9 years ago|reply
Thanks for the summary. My 2 cents: I think Amazon search ads are more profitable than Google's. When a user searches for a product on Amazon, that user is only one click away from buying.

Edit: A new search engine needs a differentiator. There is more room (more problems to solve) in Q&A search and Discovery.

[+] arekkas|9 years ago|reply
I was very young when Google became popular. Why did it become so popular? Was it just the technological advance of Google-Matrix (Page Rank) and Map-Reduce?
[+] awqrre|9 years ago|reply
Cuil had a really aggressive spider but never delivered anything useful ... I ended up blocking them to clean up logs.
[+] jacquesm|9 years ago|reply
I miss DuckDuckGo in your line-up.
[+] cocktailpeanuts|9 years ago|reply
The short answer is YES, but the long answer is, if you're thinking about building a startup, this should NEVER be the question you ask.

All the successful companies that came out of nowhere and disrupted an already stable industry never started out thinking "How do I build another X?" or "How do I disrupt X?" They all built something they thought the world needed, and it somehow went on to "disrupt X".

So if you're starting out thinking "I want to build a search engine if there's room for another," that will never work, because you don't even know what you're solving. You will be frantically searching for the question throughout your "startup" life.

[+] pedalpete|9 years ago|reply
Thank you so much for writing this @cocktailpeanuts, I'm surprised there is so much discussion around the 'yes/no' how to make the decision.

Ask better questions, and you'll get better answers.

So, the question shouldn't be 'Is there room for another search engine', but perhaps 'Would a better search engine ...', 'what comes after search engines'.

I think even looking at the 'flaws' of Google isn't really going to give you a game changer. You'll find a hole that Google can easily fill.

[+] sairamkunala|9 years ago|reply
Companies like Algolia, which provide a site-specific search engine, have been doing really well, especially on speed and relevancy, which Google is currently not concentrating on.

https://www.algolia.com/

[+] sergiotapia|9 years ago|reply
Algolia is a game-changer. They made it so incredibly simple to add search to your website. I'm not talking about their widgets, I'm talking about their server-side integrations and their javascript client-side lib.

It's like magic. See it in action here: https://stackshare.io/match

[+] greglindahl|9 years ago|reply
It's a crowded market, even though Google has basically withdrawn from it.
[+] cagenut|9 years ago|reply
This is more a feature than a different search engine, but I so so so wish I could de-prioritize blogspam. 300 - 1000 word text-heavy writeups of a couple of facts where a few bullet points, an image, a graph, a map, or a data table would be much much better. Google has been SEO'd to death because of its block-of-text lowest common denominator favoring.
[+] nkristoffersen|9 years ago|reply
Never heard the term "blogspam" before, but that is a very good description of it. When I click a link and then notice it is just a short blog post from some company trying to "content market", I leave the page.
[+] throwayedidqo|9 years ago|reply
I write blogspam. It's definitely a problem but I don't know what Google is going to do about it.

Basically google can't trust backlinks anymore because people game them and competitors try to destroy each other's sites by buying scummy links to their stuff.

So they mainly attempt to measure quality in a vacuum. This is using their machine learning stuff to look at the quality, confidence, and reading level of the writing style.

They do the same quality checks for the site. Checking for EV certs, clean markup, real email volume through Gmail, reputable DNS provider, physical address in Google Maps. A lot of their hundreds of quality metrics don't measure the site itself, but use Google's pervasive data trove from their other services. Most scammers don't bother doing any of this right.

The problem becomes people like me. I set up sites with all measures of quality for legitimate businesses. Have articles written by good writers with knowledge in the subject. Sounds great right?

The problem is that these articles are still done for money and quite biased sometimes. Google is slowly running into a need for a strong AI because all measures of quality can be emulated if enough money is on the line. It doesn't matter if something seems truthful in every way except the fact that it isn't.

This is the same reason "fake news" is invading google and Facebook. Smart spammers have upped their game to the point that it's impossible to know what's real anymore.

Need a wikipedia article changed? Good reviews on Yelp? A nice piece on a popular tech website? All of this can be openly bought with zero consequences.

[+] rgovind|9 years ago|reply
Yes. In today's search engines, I cannot give you a blacklist and say filter out these results. If I am looking for tutorials, I cannot say no video results. If I am looking for market research, I cannot filter out news websites from the links. For personalization, I cannot give google any suggestions on what I absolutely do not want to be included etc.
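
The blacklist part is not hard to do on the user's side of the result list. A minimal sketch, assuming the results arrive as plain URLs; `filter_results` and its parameters are hypothetical names, not any search engine's API:

```python
from urllib.parse import urlparse


def filter_results(results: list[str], blocked_domains: set[str]) -> list[str]:
    """Drop result URLs whose host is a blocked domain or a subdomain of one."""
    kept = []
    for url in results:
        host = (urlparse(url).hostname or "").lower()
        blocked = any(host == d or host.endswith("." + d) for d in blocked_domains)
        if not blocked:
            kept.append(url)
    return kept
```

Matching on the exact host or a `.`-prefixed suffix means blocking `youtube.com` also blocks `www.youtube.com` without catching an unrelated `notyoutube.com`. Filtering by content type ("no video results") would need metadata from the engine that a URL list alone does not carry.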
[+] greglindahl|9 years ago|reply
blekko had these features, and almost no one used them. The google guy who teaches advanced Google searching says that almost no one uses Google's advanced search, either. So if this is a viable niche, you'll have to figure out how to find these users...
[+] garysieling|9 years ago|reply
I think there is definitely space for niche search engines - there are tons of them already, if you include things like the DPLA, octopart, iconfinder.com, Spotify or class-central.com.

Google is focused on getting you to a relevant result quickly, but having a search engine that helps you discover new things is really useful. If you focus on a niche, you can also make use of a lot of metadata Google doesn't retain.

I'm exploring this on a small scale with https://www.findlectures.com. Having the date a video was made gives it a 'street view for history' feel, and lets me rank historical content differently from conferences (where recency is more important).

Building a graph of talks, conferences / speakers / books / publishers could be the building blocks for a pagerank implementation, or to build a different type of book search. Alternately, I think it would be interesting if search engines let you do LSA style queries, like "Brian Goetz" - "Java" + "Python", to help discover speakers.
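
To illustrate what such a query means, here is a toy sketch of the vector arithmetic. It is not real LSA: the three-dimensional vectors are written by hand, the axes and the "Speaker A/B/C" entries are invented, and a real system would derive the vectors from a corpus.

```python
import math


def combine(base, minus, plus):
    """Compute base - minus + plus, component by component."""
    return [b - m + p for b, m, p in zip(base, minus, plus)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def nearest(query, candidates):
    """Return the candidate name whose vector is most similar to query."""
    return max(candidates, key=lambda name: cosine(query, candidates[name]))


# Toy vectors; the axes are (java, python, concurrency), chosen by hand.
terms = {"Java": [1.0, 0.0, 0.0], "Python": [0.0, 1.0, 0.0]}
speakers = {
    "Brian Goetz": [1.0, 0.0, 1.0],
    "Speaker A": [0.0, 1.0, 1.0],  # hypothetical: Python + concurrency
    "Speaker B": [0.0, 1.0, 0.0],  # hypothetical: Python only
    "Speaker C": [1.0, 0.0, 0.0],  # hypothetical: Java only
}

query = combine(speakers["Brian Goetz"], terms["Java"], terms["Python"])
others = {k: v for k, v in speakers.items() if k != "Brian Goetz"}
print(nearest(query, others))
```

With these made-up vectors the query becomes `[0.0, 1.0, 1.0]`, so it prints `Speaker A`, the entry that keeps the concurrency component while swapping Java for Python.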

[+] adamnemecek|9 years ago|reply
I definitely think so. Google is lacking in quite a few areas. Some ideas:

a.) storing more data about the sites (and doing something interesting with said data)

b.) improving the UI/UX for power users. The best part is that I can imagine that there would be quite a few people who would pay actual money for being able to use a better search engine. Note that the Bloomberg terminal is, among other things, a search engine. For example, you could make the link graph explicit, so you would immediately see what sites link to what sites.

E.g. symbol search really leaves something to be desired on google. I also wish I could use regular expressions. I get it, they are expensive, but like even a little "expressiveness" goes far.

c.) i would pay A LOT for a good search engine for code.

[+] boto3|9 years ago|reply
What kind of code search are you thinking of? In my experience, code search could be useful when one works with a big and unfamiliar code base, but even then good architecture documentation and a good IDE would help more. And when one really needs string search, `git grep` is usually fast enough (for me on a 5GB code base).
[+] boyter|9 years ago|reply
Regarding c.), is there any reason searchcode.com is not meeting your needs? I would be happy to add it in. You can download your own version as well.
[+] chewxy|9 years ago|reply
Ben Boyter runs searchcode.com, you should check it out - he's on hn as boyter too
[+] dhimes|9 years ago|reply
Here's what I want in a search engine:

Charge me $15-25 dollars per year

Let me decide what demographic information I wish to share- make it easy for me to control and help me protect my information. Because you are charging me money you can afford it and I trust you.

Give me two search options: one, I'm only seeking information. two, I'm looking to buy. Do this for me as an advertiser: help me qualify the clicks I'm paying for

Perhaps allow me to pay per 1000 impressions (CPM) instead of per click.

By the way, I would also subscribe to a facebook that did this.

[+] boto3|9 years ago|reply
Google is making 50 USD per user, potentially an order of magnitude more from a US user from ads, so I am quite certain that your offer of $25 is a lowball :)
[+] stupidcar|9 years ago|reply
For a general search engine, no, there isn't.

The upfront capital investment, in terms of the data center capacity necessary to make a modern scraping and search infrastructure, is immense. And since the ad-word business model does not scale linearly with market share – e.g. the market leader collects a disproportionate share of the available profit – you will be losing additional money for a long time.

Since the market leader is good enough that it isn't possible to disrupt the market purely through result quality (as Google did), you will need to rely on bigger and more effective marketing spend. Not only will you have to outspend and outperform Google, but also Microsoft/Bing, who have tried to do the same thing for years, with only limited success.

Even if you have the funding necessary to do all of this, then you would be better off either buying shares in an existing search engine company, or starting a business in a different market, one with lower upfront costs and less dominant incumbents.

[+] dri_ft|9 years ago|reply
> the ad-word business model does not scale linearly with market share – e.g. the market leader collects a disproportionate share of the available profit

Why is this so?

[+] mark_l_watson|9 years ago|reply
Although I also use Google search and Microsoft Bing, probably more than 80% of my search is done with DuckDuckGo.

The fact that a lot of us use DuckDuckGo, and I hope they are profitable, is evidence that there is room for other search engines.

I would like to find a good substitute for Facebook, but so many people I know use it that I always need to check Facebook two or three times a week to not miss out on stuff, since many friends and family don't use email anymore.

Attending the Decentralized Web Conference last year got me excited about using smaller and Decentralized services. Gnu Social is pretty good, but requires work to find interesting people to follow.

[+] cassowary|9 years ago|reply
Instead of Facebook, just talk to your friends and family semi-often. If one of them has a baby or goes to Europe a lot of them will know and someone will mention it. On occasion you will hear about something three years after the event but that's still okay. It worked perfectly well for thousands of years and it still works today.
[+] jasode|9 years ago|reply
If by "search engine", you mean something similar to Google/Bing then probably not.

However, if we expand the concept of "search" to something beyond text on webpages and "engine" to something beyond a linear algebra pagerank problem that weighs url links, there's room for many more competitors.
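
For reference, the "linear algebra pagerank problem" can be sketched in a few lines. This is the textbook power-iteration form on a tiny link graph, not Google's production ranking; the function name, the damping value of 0.85, and the iteration count are illustrative choices.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank over a small link graph.

    links maps each page to the pages it links to. Pages with no
    outgoing links spread their rank evenly over all pages.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank
```

Everything here is computed from links alone, which is the point of the argument: a signal that does not live in page text or links never enters this calculation.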

Let's say we want to search for "best restaurant":

Method #1 might be searching millions of web pages, twitter posts, newspaper archives, etc where ngram such as "best restaurant" is mentioned. That's what Google/Bing engines already do.

Method #2 might rank restaurants by collecting crowd-sourced opinions. That's what Yelp & Tripadvisor do. (Although Google also piggybacks on their data and lists yelp pages in SRP.)

Method #3 might be a company like Visa/Mastercard analyzing their billions of transactions[1] and based on actual spending amounts & frequency of a billion cardholders, they can also provide their own calculation of a "best restaurant". (I know that Visa/MC already offer limited marketing data to some entities but they don't surface that data to every day web surfers.)

The idea is that there's plenty of room for more imaginative scenarios of #2 & #3. The common theme is that Google doesn't have the data (e.g. credit-card transactions) and therefore, the new "search engines" can give fresh answers that Google algorithms can't provide. To try and boil it down to a simple question: "What interesting answers can a new engine provide that _can't_ be extracted from the text of webpages?"

Btw, I ran across some posts from a Microsoft employee (but not a Bing team member) stating his opinions on building competing search engines. https://news.ycombinator.com/item?id=7011472

[1] http://marketrealist.com/2016/10/why-visas-processing-and-in...

[+] eridius|9 years ago|reply
Method #3 would be for calculating popular restaurants, not best restaurants. For example, I bet McDonald's would rate pretty highly with that approach, but it's very nearly as far away as you can get from the idea of a "best restaurant".
[+] AznHisoka|9 years ago|reply
BuzzSumo.com is a good example as well.
[+] larrydag|9 years ago|reply
I definitely think there is room for search improvement. I believe the next area of search is contextual search (https://en.wikipedia.org/wiki/Contextual_searching). If you can connect what the user is looking for to actual website content then I think you might be onto something. The trick is finding that link function. Traditionally Google has relied on keywords and ranking by links. There could be other ways to find that user/content relationship.
[+] Skylled|9 years ago|reply
I'd think there would have to be something fundamentally different. It would have to be hardly recognizable as a "search engine."

There are too many clones on the market right now. Some with good purpose, like DuckDuckGo which can be simplified to "Google but without privacy invasion." Others like Bing could be just "Google but clunkier." (my opinion)

If you've got an idea on your hands that can't be described as "Google but..." then there's definitely room for another.

[+] tarr11|9 years ago|reply
Take a look at how DuckDuckGo built up their business around privacy first, and leveraging Google when appropriate.
[+] atemerev|9 years ago|reply
Some ideas:

1) a good sitewide search engine. Google's offer is laughable, and Algolia is too developer-centric (requires pushing the data through API). What I'd want is a single input field where I can put my site's main page URL — and get a working search in a few minutes.

2) subscriptions / monitoring. I want to monitor some event or topic, and I want the updates to be delivered to e.g. my WhatsApp/Telegram/Slack/whatever, with smart filtering, refining etc (in lieu of frantically Googling / redditing / refreshing Twitter feed)

3) context-preserving interactive search, that can ask me questions/ refine results.

4) Timeline search interface for news / events / company history etc. I want to be able to put the name of a person, or company, or TV series, and get a comprehensive timeline view of everything that happened there.

I have a lot more ideas, and zero free time :(

[+] LiamBoogar|9 years ago|reply
1) Swiftype (or any of Algolia's integrations - WordPress, ZenDesk, Shopify, Magento, ...) - here you talk about Algolia's developer focus, but the rest of your arguments are about the consumer experience. All search engines are built by developers/engineers, and Algolia delivers end-user experience on Twitch, Periscope, Medium, and even HackerNews (hn.algolia.com), which are exactly what you're looking for. You can actually use Algolia to create all the search engine experience ideas you have, and it takes less time (which you don't have)

2) Mention (http://mention.net)

3) Jelly (didn't work. Maybe there's a reason?)

4) Google / Wikipedia.

Unless you can build something 10x better than what exists,

[+] Razengan|9 years ago|reply
I think search engine functionality will sooner or later need to be incorporated into the core specifications for the internet, like DNS.

I mean, the modern idea of the internet is pretty much useless without a search engine, and we've been spoiled by the power we get through Google — the phrases on this very page get indexed within literally seconds; I just tried a literal search for a sentence from a 3-minute-old comment here — but it's really not a good idea for a single company to have so much authority.

This really isn't something to keep relying on a small handful of companies for, especially once we have interplanetary internet. :)

[+] yitchelle|9 years ago|reply
Yes. Google is too generic and that is great for the internet.

I would look forward to search engines that are topic specific. However, the blocker is having the information available in the first place, so I doubt if this will ever happen.

[+] tommynicholas|9 years ago|reply
Absolutely - Giphy is a great example. There will be plenty of search engines that will grow to prominence around either a niche content type (gifs => Giphy) or a niche feature (privacy => Duck Duck Go).
[+] mrharrison|9 years ago|reply
I feel like amazon is my second search engine. So yes there is room with category specific search engines. Reminds me of when people started making specific apps from craigslist sub-categories.
[+] makecheck|9 years ago|reply
It is possible to compete with Google by offering what they used to have: simplicity and speed, and not screwing with results.

Google is beginning to show signs of accidental self-sabotage. Their AMP approach was so aggravating for me on mobile that I literally switched search engines to avoid it. And their insistence on scraping and summarizing things and trying to prevent you from even visiting other sites is slowly ruining even desktop searches. They are in danger of disruption.