top | item 902999

Ask HN: do you feel Google search result quality has gone down?

112 points| coffeemug | 16 years ago | reply

Yesterday I was surprised to find myself trying Yahoo search because I couldn't get satisfactory search results in Google. It was the first time in years. I started thinking about this, and I realized that in the past few months I haven't been getting particularly good results from Google. I don't get spam or anything, but a lot of times I don't get useful results.

The thing is, I'm not sure if it's because I do a lot of very specialized stuff these days, or because the search quality really has gone down. Consider these two examples:

Search for "Linux asynchronous IO". You'll get a lot of articles, but most are four years old (which is an eternity in the Linux world). These results aren't very good - posix AIO is implemented in userspace threads, and io_submit and friends don't work in many cases. Which cases? Hard to tell - I couldn't find any information in the results no matter how long I searched. I couldn't find any benchmarks either.

Perhaps it's because there is no good info on this on the web (hard to believe). So let's try something else - search for "concurrent hashmap in C". After hours of searching and playing with keywords, I got almost no useful results (other than Intel's libs, but not too much info on that either). It's difficult to believe that there are no good implementations out there.

So, is it the specialized nature of my searches, or is it Google? What do you think?

109 comments

[+] joeyh|16 years ago|reply
I've used google since it was google.stanford.edu, and it's clear to me the results have suffered. My feeling is that two of the problems are SEO and feedback effects of google's own popularity.

SEO: When you cut through all the BS, the entire goal here is to make a less good match come first. And it works (sorta). Just consider crap sites like Experts Exchange that we've only learned about because they pollute many searches.

Feedback effect: Thanks to google, fewer people do less collecting of good links. Why bother when you can google for it? So there's less good information for google to use in ranking links. Bear in mind that when google started, nearly every home page had a long list of links to all the pages that particular user liked and frequently used. I used to have one; I've long since deleted it; my blog has some outgoing links that I like, but relatively few. If I twittered, I'd probably post a lot of outgoing links, but of dubious value; there's no gardening of just the perfect page of 100 links going on anymore.

(I think this also partially explains why some (generally more specialized, so less affected by other things) results feel dated -- legacy links that are still hanging around from days when links were still used that way.)

Feedback effect: Thanks to google, ten sites tend to be more important than any other sites on any given topic. This results in certain sites becoming increasingly important. Wikipedia is the chief example here. Why is there only one Wikipedia and not a dozen? Chiefly because it's gotten all the google juice. If you want your wiki article on foo to show up in google, you naturally write it on Wikipedia, not Fooipedia. The result here is that all google searches feel increasingly the same -- of course Wikipedia is always in the top ten, or maybe something like Stack Overflow for a technical search.

----

So, these days, if I don't see something interesting in the top ten, I often click on the link to page 10 (or 20, or 100) of the results. Often more interesting. For example, google for "mashed potatos".

Top 10 results: "Perfect mashed potatoes" (SEO), allrecipes.com (always in top 10 for any recipe search), foodnetwork.com, Wikipedia, about.com, nytimes, etc. Pictures of mashed potatoes. All generic and useless.

Page ten results: Dairy-free mashed potatoes. _Potato_-free mashed potatoes! Caramelized Onion Horseradish Red Mashed Potatoes! A poem about eating them. At least marginally more interesting and quirky. What I would have expected out of google circa 1997.

[+] skolor|16 years ago|reply
I'm not quite sure what exactly it is you would want from them. http://www.google.com/search?q=mashed+potatos has a list of recipes for the query. Seems like exactly what you would respond to the question "What do you know about mashed potatos?" If I change it to "mashed potatoes" like suggested, I get the rest of the results you mentioned. Again, this is exactly the kind of stuff you wanted.

Now, if you want something "quirky", why are you searching for a generic term? What kind of "useful" result do you want from a search on mashed potatoes? If you give them a crappy search query, they should be giving you as generic of results as possible.

One thing I've found is that if you are looking for something specific, don't search for something generic. If you wanted something "quirky", why didn't you do "mashed potatoes quirky"? Then you get a restaurant that features mashed potatoes heavily in their recipes, a caramelized onion mashed potato recipe, a mashed potatoes festival, several more "interesting" recipes, and a book called "Grinning in His Mashed Potatoes".

It sounds to me like the results have improved, not gotten worse, if you aren't getting a poem about mashed potatoes on the first page of search results for just "mashed potatoes".

[+] pfedor|16 years ago|reply
The results from Experts Exchange are typically useful, but you have to scroll all the way down past the ads and other crap to see the actual answers.
[+] alecco|16 years ago|reply
A lot of non-technical people lately seem to type whole questions and sentences in the search box. Syntax analysis is hard for them, it seems. But Google now encourages this and levels everything down.
[+] Retric|16 years ago|reply
So, do you think it's time to start adding a top-links page to help them out?

I have started tailoring my searches in odd ways to help them out. Ex: Adding the year when I want current results. But, without useful links it's all GIGO.

[+] bhseo|16 years ago|reply
SEO: When you cut through all the BS, the entire goal here is to make a less good match come first.

That is not the entire purpose of SEO. There are good sites out there that don't provide their content in a way that can be indexed by spiders. SEO often solves that. There certainly are plenty of people making bad websites and trying to make them rank; it's Google's job to weed out the useless information.

People still collect links (shameless self promotion: http://internetmindmap.com ), they still have huge blogrolls, there are human powered search engines, a vast amount of directories for every imaginable niche...

Google doesn't need the perfect page of 100 links and I doubt it ever did.

Your mashed potatos example does not make sense. Google gave you generic info for your generic search query. How is that bad?

Now, the fact that certain sites dominate a very wide range of search queries is an interesting point. Personally, I would just add a sidebar or something similar, to be occupied by the "staple" sites, such as wikipedia, about.com etc.

[+] Erf|16 years ago|reply
Yes.

One thing I noticed is that searches no longer require that all words in the query be present in the search results. Adding a + before a word is now required to ensure that it's present in results. That frequently results in me having to do 2-3 searches to find something that could previously be found with one.

[+] huangm|16 years ago|reply
Another annoying snag is that if you search for "A B", google will also search for "AB" (eliminating the space). This affects a lot of searches with acronyms and technical terms. For example, if you're looking for info on MIT's RAs, the top search results for "ra mit" or "mit ra" are related to "ramit" or "mitra".

This seems to be an optimization for their average user, but is really inconvenient for people searching for system errors, mathematical/cs theory terms, or other queries where acronyms are common.
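
To make the "A B" versus "AB" complaint concrete, here is a toy sketch of the behaviour described above. It is not Google's algorithm; the function names and the whitespace tokenization are assumptions made purely for illustration. The lenient matcher also accepts a page that contains the query words run together, which is how "mit ra" ends up matching "mitra".

```python
def strict_match(query: str, page_text: str) -> bool:
    """Match only if every query word appears as its own token on the page."""
    words = query.lower().split()
    tokens = page_text.lower().split()
    return all(word in tokens for word in words)


def lenient_match(query: str, page_text: str) -> bool:
    """Hypothetical lenient matcher: also accept the query words joined together."""
    words = query.lower().split()
    tokens = page_text.lower().split()
    joined = "".join(words)  # "mit ra" -> "mitra"
    return strict_match(query, page_text) or joined in tokens


if __name__ == "__main__":
    page = "Mitra is a common surname"
    print(strict_match("mit ra", page))   # the page has neither "mit" nor "ra"
    print(lenient_match("mit ra", page))  # matches via the joined form "mitra"
```

Under these assumptions, the strict matcher rejects the "Mitra" page while the lenient one accepts it, which is the kind of result the comment above finds unhelpful for acronym-heavy queries.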

[+] klon|16 years ago|reply
I have noticed this too and find it very irritating. I expect ALL words to be present. Why did they change this?
[+] protomyth|16 years ago|reply
Even with the +, google makes some interesting "interpretations". I notice when I search on one of the BSDs (e.g. OpenBSD) it seems to pick pages that just have BSD on them.

Plus, google's handling of punctuation (e.g. f-script) is a pain since (even with the +) it will do weird substitutions and consider blanks good enough.

[+] skolor|16 years ago|reply
Since I started using more advanced search features regularly I have gotten significantly better results. Things like +"my search term", -free, -download, -cracked tend to heavily limit the spamming of results, and if I want something specific using tricks like inurl:keyword and site:siteToSearch tend to make what I want just jump out on the first page.
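
As a rough illustration of the operators mentioned in the comment above, the sketch below assembles a query string from a required phrase, excluded words, and optional inurl: and site: restrictions, then builds a search URL. The helper names `build_query` and `search_url` are hypothetical, and the sketch only formats text; it makes no claim about how any search engine interprets the operators.

```python
from urllib.parse import urlencode


def build_query(phrase, excluded=(), inurl=None, site=None):
    """Assemble a query using the +"phrase", -word, inurl: and site: operators."""
    parts = [f'+"{phrase}"']                      # required exact phrase
    parts += [f"-{word}" for word in excluded]    # words to exclude
    if inurl:
        parts.append(f"inurl:{inurl}")            # keyword must appear in the URL
    if site:
        parts.append(f"site:{site}")              # restrict to one site
    return " ".join(parts)


def search_url(query):
    """Build a search URL in the same form as the one quoted earlier in the thread."""
    return "http://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query("my search term", excluded=["free", "download", "cracked"])
    print(query)
    print(search_url(query))
```

Keeping the exclusions in a list makes it easy to reuse the same spam filter words across searches.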
[+] alecco|16 years ago|reply
Also the - minus comes in handy to exclude similar terms or phrases. Yeah, this feels just like Altavista.
[+] pqs|16 years ago|reply
Thanks for the tip of the + !!
[+] mahmud|16 years ago|reply
Yep. I never used to use search operators unless I was looking for something really specific. Nowadays I get completely irrelevant results and I am forced to quote strings and explicitly specify term precedence, conditionals and other regexy stuff.

Last night I was searching for the syntax of DEFTYPE when used with various types (i.e. MEMBER, SATISFIES, OR, etc.) and the #1 hit for "deftype member" was the personal MySpace page of some guy.

I think they're optimizing for "social" results now.

[+] NikkiA|16 years ago|reply
They'd probably do well to have a separate 'technical search' for searching for things related to technical matters, e.g. programming languages, physics, chemistry, medicine, engineering, etc. And remove the casual stuff (facebook, myspace, pages that are clearly not about a technical subject, etc) from that index.

It would probably be a highly praised feature to separate off a second index like that, as specifically searching for programming language concepts and documentation can be difficult (the C# and .NET problem).

[+] Skeuomorph|16 years ago|reply
It's also aggravating that quotes don't allow matching of specific symbols within the quotes.

And, yes, the results are different. I'd agree with you they seem enhanced (e.g., classified results) and fresher for popular culture, but somewhat worse for domain-specific queries.

[+] coliveira|16 years ago|reply
Wait until they integrate twitter. It will be 99% real-time crap and 1% historical results... Google should just stay away from real time frenzy, or maybe fork the search engine to avoid disrupting the good results it has.
[+] asyazwan|16 years ago|reply
Can't agree more. Let's hope in the future we can choose to search for tech-specific results.
[+] abstractbill|16 years ago|reply
Just out of interest, do you find lispdoc.com useful generally for this kind of thing?
[+] epi0Bauqu|16 years ago|reply
Yes, and this is why I started Duck Duck Go: http://duckduckgo.com/. Thanks for posting specific cases--they help me immensely. Anyone else have more?
[+] catzaa|16 years ago|reply
I've used Duck Duck Go a bit and the results are okay. Maybe just a shorter URL would be awesome (such as ddg.com, ddgo.com, etc.).

Here is a search that I have had problems with :

octave "--eval"

Your site does pretty well with this (fourth link is somewhat relevant).

[+] dragonquest|16 years ago|reply
I really have to applaud your efforts on that front. I've been using duckduckgo on and off for some time now and I must say I'm mighty impressed. The best part is the keyboard style navigation and the big clean look. Always retain those two. :)
[+] Adrenalist|16 years ago|reply
I was searching for a way to enable/find the chat logs for Microsoft Communicator which we've recently switched to at work.

Google was basically filled with dead end forum postings and SEO spam.

DuckDuckGo was more helpful and brought me to the MS TechNet article with full documentation on MS Communicator Policy configuration.

Bing was surprisingly the most helpful and brought me to the Communicator Team posting from 2008 which shows me where I should have been able to find the logs.

It looks like my work has blocked/disabled this feature on a global setting even though I have it enabled locally.

[+] nkurz|16 years ago|reply
There is one change that gets me often. Using a hyphenated-word used to require that the two words occur in order, although it would also match the two words joined together. It also used to turn off stemming.

Previously it was equivalent to ("hyphenated word" OR +hyphenatedword). But now it seems to behave almost the same as the unquoted (hyphenated word).

Just to make matters worse, when I tried out my example just now I found that the first result (a wikipedia page) for "hyphenated word" doesn't even include the phrase!
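
The old hyphen behaviour described above can be written down as a small query rewrite. This is only a sketch of the equivalence stated in the comment, ("hyphenated word" OR +hyphenatedword); the function name `expand_hyphenated` is hypothetical, and nothing here reflects how the search engine is actually implemented.

```python
def expand_hyphenated(term: str) -> str:
    """Rewrite a hyphenated term as: exact phrase OR required joined word."""
    parts = [part for part in term.split("-") if part]
    if len(parts) < 2:
        # Not a hyphenated compound (e.g. "word" or "-free"); leave it alone.
        return term
    phrase = " ".join(parts)   # words must occur in order
    joined = "".join(parts)    # or appear run together as one word
    return f'("{phrase}" OR +{joined})'


if __name__ == "__main__":
    print(expand_hyphenated("hyphenated-word"))
```

For the example term, this produces the same expression quoted in the comment; a term without an internal hyphen is returned unchanged.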

[+] hotpockets|16 years ago|reply
I hate searching google for technical jargon (and it's worse the more technical the jargon is). Usually google just gives you a bunch of academic papers (in case you are wondering, academic papers do not actually explain anything, they just use a bunch of jargon in a plausible way). Gee thanks. If you want I will think up an example.
[+] raphar|16 years ago|reply
One feature I use a lot from google is the cache, just to avoid websense at work.
[+] thaumaturgy|16 years ago|reply
Wow, that's amazing. I think I'll be using this a little more often.

Notably, you return a result for a specific osCommerce error message that I wrote about a while back (!); Google doesn't even know I exist.

[+] BearOfNH|16 years ago|reply
I've been using duckduckgo; very nice. One UI comment: when I click on "More Links" I'd like some kind of cue -- say, a horizontal line -- so I know where to start looking when the results come back. I frequently click for "More" then while that's loading I go look at another tab. When I return to Ducky it's sometimes difficult to separate the new results from ones I've looked at (but not clicked on) before.

You might try to track page age; some of the results I get are from 5 years ago and as noted by others, aren't always useful today. But that's a harder problem for another time.

[+] NikkiA|16 years ago|reply
It has certainly gone waaaay south since day 1, that much is undeniable. Of course, a large part of that is that the internet has gotten a lot more useless filler in those years, and this has of course made 'relevant search' an astronomically harder problem than it was in 1998.

On the more important metric of 'has quality gotten worse in the last couple of years', I would say 'sort of'. The direct quality of results HAS suffered, but on the other hand, google have implemented the user-wiki thing that allows you to modify, to a degree, which sites are less relevant.

I will add that I think google needs to rethink its keyword fuzziness. In the past it used to be acceptable if the results didn't exactly match what you were searching for, but these days that is becoming more of a problem. If I search for a bunch of words, I typically know I want those words. By all means suggest 'did you mean ... ?' but the fuzziness in the results needs to be pulled back.

[+] robk|16 years ago|reply
Back to the Altavista days of +search +must +include +plus +signs +"and quotation marks"

Seriously, the problem is advanced users get unexpected results with the query expansion and refinement layers they've added on. While dropping obscure words is helpful when grandpa has misformed queries, it's maddening for a technical user looking for a very specific, infrequent keyword. However for grandpa, it's probably a better experience for a generalized result.

Using +'s works well enough, but it's disappointing we have to use a less efficient method of querying now.

[+] rw|16 years ago|reply
More generally, I think that we don't have a good mental model of how Google is searching. It used to be straightforward, almost like Git, which won't make any "clever" merges. Google, as a tool, is on the decline because it's not comprehensible (even to us advanced users!).
[+] niyazpk|16 years ago|reply
Quote from Google employee:

It does work as described! - which may very well not be as desired. If you want a more mathmatical-logical use of operators then you need to go find another search engine.

http://www.google.com/support/forum/p/Web+Search/thread?tid=...

I am afraid that Google is now more of a 'social search engine' than a 'hacker search engine'.

[+] MattCutts|16 years ago|reply
I think the results quality on Google has been the same or better over the last few months; one thing we have been looking at is helping less savvy users who might mistype a word or type extra words that they don't really need in their query. That can be a little more annoying for power users, but on the other hand the power users pick up tricks like "Use a '+' in front of a word to require Google to match that word."

Regarding the query [Linux asynchronous IO] returning older results, here's a tip. Above the search results click the "Show options" link to open up what we call "toolbelt" mode. From there, you can click to show only results from (say) the last year, or in a certain date range.

Toolbelt mode is really handy, e.g. if you search for a product, you can click "Show options" and then click the "Fewer shopping sites" link to get more reviews and manufacturer pages instead of comparison shopping sites.

[+] indiejade|16 years ago|reply
Did you try http://www.google.com/linux

?

Although I alternate search engines regularly, I do think the way Google indexes for specialized searches is pretty smart.

[+] VladimirGolovin|16 years ago|reply
There was a comment on LessWrong where a poster suggested using Yahoo over Google for non-quoted search. I tried it myself (I needed to find an article based on several words mentioned in it) -- and was pleasantly surprised. Yahoo seems to be doing a much better job than Google for non-quoted search requests.
[+] petervandijck|16 years ago|reply
I tried Bing the other day out of curiosity, and the results were incredibly disappointing compared to Google's.
[+] garply|16 years ago|reply
Did you try their image search or product search? I've almost completely dropped those two parts of Google due to Bing.
[+] ottbot|16 years ago|reply
I thought so too, Bing seemed very poor for the technical related searches I made.
[+] mattlanger|16 years ago|reply
I've actually noticed the inverse problem with some non-tech-related searches. Politics is a fine example: try and find an article about a 2002 House vote on Social Security and there's no chance you'll find it; the results are all present day.

Strange that with regard to current events the web seems to have little historical memory, and yet with regard to current technology it has too much.

[+] mojonixon|16 years ago|reply
The autocorrect is maddening. I've been researching Riak recently. It's new so there isn't a lot available. Paired with another search term I frequently get results only for "risk." Let me know if I might have made a typo, but don't assume I'm an idiot and do something different than what I told you to do.

Dropping keywords also annoys me. If the keywords don't exist then tell me that so I can adjust my search. Don't give me a long list of results that I have to click through before realizing you screwed up the search.

The only reason the google search bar is still my default is because I use it as a quick and easy calculator.

[+] webwright|16 years ago|reply
Absolutely-- especially when you drift out of the world of technology. Google is based on the "linkerati" (hat tip to SEOmoz) - geeks and bloggers who link to stuff aggressively. That works great in the worlds where people link (social media) but poorly if you're searching for non-geeky stuff.

I've been doing home remodeling a bit lately, and it's clear to me that there are NO home remodeling linkerati. But there are plenty of SEO guys out there and it only takes a few low quality links to top a lot of searches. So search for home remodeling stuff and you see plenty of adsense spam.

[+] utnick|16 years ago|reply
it's more that the internet has gotten worse, I think

They should drop yahoo answers, any site with an affiliate link, and any of the internet marketer sites ( like ezinearticles.com ) from their index

[+] qeorge|16 years ago|reply
Not overall, but I have found Google frustrating for things that are very recent. For instance I wanted to watch Obama's speech to the schoolkids on YouTube (the day after he'd given it), but the only videos Google would bring back were from his race speech in Philadelphia. I tried all the operators and keywords I could think of, and couldn't get what I wanted.

FWIW, Bing nailed it on the first query.

[+] rg|16 years ago|reply
It's eye-opening to try "Blind Search", which submits your query to Google, Yahoo!, and Bing simultaneously, and displays to you the three sets of results without (at first) identifying which is which:

http://blindsearch.fejus.com/

I was amazed to discover that I was consistently choosing the blind results from Yahoo! as best for my own searches.

[+] jrockway|16 years ago|reply
I think you picked two bad terms. "Linux asynchronous IO" means more than just "IO under Linux that's not blocking". That's what most pages using those terms are referring to, so it makes sense that that's what Google would give you.

"Hashmap", the term you searched for, is a term popularized by the Java world; I think most C programmers still call them "hashtables". (Hash map is a better term, as many excellent map implementations aren't actually based on hashes; see "Judy" for example.) A search for "C concurrent hashtable" gave me a lot of useful results. (You are also suffering from C's lack of a coherent community here. Lots of people write this sort of thing, but few think to share it. Hence, not many search results.)