
Is Google Search Deteriorating? Measuring Google's Search Quality in 2022

470 points| echen | 4 years ago |surgehq.ai | reply

414 comments

[+] lbriner|4 years ago|reply
One thing I find annoying is that they still return results that are those sites that seem to register a load of terms that all point to the same page. You see this with telephone numbers, song lyrics etc. where the result looks like "Lyrics for Stairway to Heaven" but you click through and there is no content, just a page that says "Upload some lyrics to this song".. etc.

These sites should be heavily penalised for click-baiting and they have been doing it for years.

[+] cik|4 years ago|reply
I've found my result quality has gone up dramatically now that I use extensions allowing me to block domains from Google search results. It seems silly that I should have to do this, but Google has finally become useful again as a result.
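As a rough illustration of what such a blocking extension does, here is a minimal sketch that drops results whose host is on a blocklist. The domain names, the `BLOCKED_DOMAINS` set, and the function names are all hypothetical; real extensions work on the rendered results page rather than on a list of URLs.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; these are placeholder domains, not recommendations.
BLOCKED_DOMAINS = {"lyrics-spam.example", "phone-lookup.example"}


def is_blocked(url: str, blocked: set[str]) -> bool:
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)


def filter_results(urls: list[str], blocked: set[str]) -> list[str]:
    """Keep only results whose host is not on the blocklist, preserving order."""
    return [u for u in urls if not is_blocked(u, blocked)]


results = [
    "https://www.lyrics-spam.example/stairway-to-heaven",
    "https://en.wikipedia.org/wiki/Stairway_to_Heaven",
    "https://phone-lookup.example/555-0100",
]
kept = filter_results(results, BLOCKED_DOMAINS)
print(kept)
```

Matching on the host suffix (rather than on a substring of the whole URL) is a deliberate choice here, so that a blocked name appearing in a path or query string does not hide an unrelated site.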
[+] amelius|4 years ago|reply
Yes, give me a downvote button on Google results.
[+] 1vuio0pswjnm7|4 years ago|reply
"These sites should be penalised for click-baiting and they have been doing it for years."

If SEO works and a result appears closer to result #1 in the SERPs, then the "true", non-SEO-assisted result it is displacing would appear further from result #1. Apply this across the board and what we have are many, many non-SEO results that are pushed down in Google's ranking. No one is "penalising" these pages; however, they suffer visibility problems because they have not engaged in SEO. The incentives created by Google's secretive ranking system and online advertising commercial focus are perverse or at least in conflict with the user's goals. Google discourages and even prevents any user from looking at results that were hits but were not ranked high. Pages that have not succumbed to the influence of such incentives may "disappear".

What if a user understands this and wants to ignore the Google ranking system? What if the user wants to see the true, non-SEO results? Google actively limits the user's ability to see those displaced results. For example, if a user searches for a common term, such as "example", she will not be able to view more than 200-300 results. Elsewhere in this thread someone also noted that even with a paid API, Google limits users to 1000 results. If the user wants to see the full range of pages that have hits for the word "example", she cannot do so. If the user would like to perform a single search for all pages containing the term "example" and then sort by some other objective criteria such as alphabetical by domain name, date, page size, etc., she cannot do so.
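To make the idea of user-chosen ordering concrete, here is a small sketch that re-sorts a set of hits by domain name, date, or page size. The `Hit` record and its fields are hypothetical; no search API is known to return results in this shape, and the sample data is invented.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse


@dataclass
class Hit:
    # Hypothetical record for one page that matched the query.
    url: str
    published: date
    size_bytes: int


def domain(hit: Hit) -> str:
    """Lower-cased host name of the hit's URL."""
    return (urlparse(hit.url).hostname or "").lower()


# Each sort criterion is just a key function over a hit.
SORT_KEYS = {
    "domain": domain,
    "date": lambda h: h.published,
    "size": lambda h: h.size_bytes,
}


def sort_hits(hits: list[Hit], by: str, descending: bool = False) -> list[Hit]:
    """Return the hits ordered by the chosen criterion, ignoring any ranking score."""
    return sorted(hits, key=SORT_KEYS[by], reverse=descending)


hits = [
    Hit("https://zebra.example/a", date(2021, 5, 1), 12_000),
    Hit("https://alpha.example/b", date(2009, 3, 14), 48_000),
    Hit("https://mid.example/c", date(2015, 8, 30), 3_500),
]
by_domain = [domain(h) for h in sort_hits(hits, "domain")]
oldest_first = [h.url for h in sort_hits(hits, "date")]
largest_first = [h.url for h in sort_hits(hits, "size", descending=True)]
print(by_domain, oldest_first, largest_first)
```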

Under Google's model of the web, pages that do not acquiesce to an online advertising company's secretive ranking system may become nondiscoverable, despite the fact that they may indeed match the user's query. Computers assist us in searching through data but "relevance" is ultimately decided by the user. That is why we can have HN threads that claim search result quality is declining. Though they may be slower, humans can determine relevance better than any computer. From the disclosures of Matt Cutts and others we know that humans are involved in Google's ranking implementation. Penalties are used. The search process is not 100% math/computer-based. However, in Google's model of the web, filtering results is the exclusive domain of the online advertising company and only the humans on its payroll, not the user performing the search. There is no option to disable the online advertising company's "assistance" in filtering.

[+] marginalia_nu|4 years ago|reply
Ironically, my work on my own search engine has led me to be a bit more patient with Google's problems. At least I think I understand them better. Search engines fail in weird ways.

I think in part that Google just has gotten a spectacularly confusing failure mode. If it can't find good matching contents, it starts second-guessing your query and producing other results, which makes you think it's not even considering what you entered. It may even be "better" in the sense that it's more likely to return at least something relevant, but in practice it's bad UX because it's so unintuitive what's happening. It's probably one of those unfortunate optimizations that are invisible when they work and frustrating when they don't.

There is so much stuff on the Internet it's easy to start thinking there are guaranteed to be good results for any search, and that just doesn't seem to be the case. Especially with highly specified searches with 6-8 terms, you quickly enter the domain where you're reasonably unlikely to find an exact match.

[+] elondaits|4 years ago|reply
When I search for "databricks series b valuation" in Google (from Argentina, using Google.com in English) result #6 is:

"Python get value from database - Büro Jorge Schmidt", which judging by title and preview seems to be a Python + MySQL tutorial. It returns a 403 error and might be a hacked site, since the home page is for a graphic design studio in Munich.

Result #8 is something similar:

"Intellij flatten packages - Músicos de Viaje". This is definitely a hacked site (from Spain, apparently) that redirects me somewhere else.

Result #10:

"How to calculate tax percentage in sql query". Another hacked site, this time for an evangelical church from Brazil.

Now... how can Google think that any of these sites are relevant? Even if it doesn't realize the pages are hacked... even if its crawler has been fed content that included the keywords...:

A - The sites themselves don't match the query at all.

B - No legit site about the subject would link to these sites.

C - The results themselves (title, url, preview), as Google shows them, have nothing to do with the search!

[+] creato|4 years ago|reply
I just tried that search, all of the results look relevant and I definitely don't get any of the results you are getting.

I wonder if you have some malware that is hijacking the results? I once had some malware (chrome extension) that was corrupting my search results. It was surprisingly difficult to remove (given that it was a chrome extension...).

[+] pverghese|4 years ago|reply
Just tried that search in English. First result is "The data- and AI-focused company has secured a $1.6 billion round at a $38 billion". And if you click the search box you get the first search option as databricks valuation history where the first result is funding every round.

Google has been so much better for search for me than other search engines. At least for what I search for: programming, news, etc.

[+] freediver|4 years ago|reply
The article mostly talks about IA (instant answers) which are notoriously hard. The recent advances in machine learning have made the technology more approachable, so startups like Kagi Search (disclaimer: founder) can also leverage latest advances in NLP and compete on this ground.

To give just a few examples:

Query 1: how many stars in the usa flag

Google: https://cln.sh/63sVzh

Kagi: https://cln.sh/bFEHsD

Pretty surprising that Google would get something like this wrong.

Query 2: when did moon explode

Google https://cln.sh/fUhdJS

Kagi https://cln.sh/5wDvXG

Both engines feature the same article but for some reason Google decides this is not fiction, and gives a (wrong) answer.

Query 3: do most rabbits have short or long ears

Google: https://cln.sh/JuOeqq

Kagi: https://cln.sh/BkZi6O

Both engines use the same article for source, but Google completely misses the context.

These examples show that a search startup has a chance to go neck and neck with Google and compete even in technology as sophisticated as instant answers. We invested considerable resources in the Kagi Search AI capabilities, discussed in some detail here https://kagi.ai/last-mile-for-web-search.html

What is mind boggling though from a product management perspective is that Google had nearly a decade head start and a cash purse of hundreds of billions of dollars to get this right.

To be fair, it is likely that the vast majority of queries are answered correctly, but only the outliers get the public attention. Kagi is not without its own share of silly mistakes either, but just being able to be considered in the same basket as Google is already a huge thing for us.

[+] klondike_|4 years ago|reply
I think that Google is optimizing for the "average user" to the detriment of power users such as the HN crowd. Most people treat Google as an internet oracle and send queries like "how do I do X" while power users will search for keywords. One example of this optimization is the automatic answer boxes that show up for certain questions, which are wrong disturbingly often or don't include important details.
[+] ghosty141|4 years ago|reply
The average user absolutely does this. I see this with family members and friends. Most just type in full questions.

From my experience the best way to get good results is to start typing keywords for your question and then build the query based on the autocomplete results. If I notice I don't get autocompletion for a certain query I'll restructure it until I do. This has proven very useful in providing good results.

For technical stuff, using the quotation marks is almost essential.

[+] User23|4 years ago|reply
I miss Altavista. I could generally either find exactly what I was looking for inside of three iterations of refining the search, or find that it wasn't to be found.

I still just want a blazing fast full text search of the reachable WWW that understands regexes and a basic predicate calculus. Unfortunately the overhead and small potential user base means that under the current regime such a thing will never be made.
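A toy version of that wish, regexes combined with basic boolean predicates, can be sketched in a few lines. This is only an illustration over an in-memory dictionary with invented documents; it scans every document linearly and says nothing about how to make such queries fast at web scale, which is the hard part. All function names here are hypothetical.

```python
import re
from typing import Callable

Predicate = Callable[[str], bool]


def matches(pattern: str) -> Predicate:
    """Predicate that is true when the regex is found anywhere in the text."""
    rx = re.compile(pattern, re.IGNORECASE)
    return lambda text: rx.search(text) is not None


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND over predicates."""
    return lambda text: all(p(text) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR over predicates."""
    return lambda text: any(p(text) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda text: not pred(text)


def search(corpus: dict[str, str], pred: Predicate) -> list[str]:
    """Return the sorted keys of every document satisfying the predicate (linear scan)."""
    return sorted(url for url, text in corpus.items() if pred(text))


# Invented sample documents.
corpus = {
    "a.example/1": "Gamakatsu octopus hooks, size 2/0, pack of 25",
    "b.example/2": "Circle hooks and octopus recipes",
    "c.example/3": "Buy hooks now! Best hooks! Cheap hooks!",
}
query = all_of(matches(r"\bhooks?\b"), matches(r"octopus"), negate(matches(r"recipes?")))
found = search(corpus, query)
print(found)
```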

Speaking of, if any government actually wants competition, they don't need to break up Google, they just need to force them to offer full access to their cache and compute at some reasonable rate, much like how the ILECs were made to carry the CLECs' traffic.

[+] pictur|4 years ago|reply
I think it's the opposite of what you said. Non-expert users search more precisely, with longer sentences.
[+] achairapart|4 years ago|reply
This. And let's not forget that it is the "average user" who mostly naively clicks on ads, not the power user. And selling ads is still Google's core business.

Looks like Google is slowly turning into a big Nigerian scam.

[+] Gigachad|4 years ago|reply
I think this is overall a good thing. Power users have trained their behaviour to work in the way that simple systems can deal with. While average users ask the question exactly how they would ask another human. Google has now reached a level where it works best when you deal with it in a natural and human level.

There is nothing actually better about the way we originally used search engines, it was just required at the time.

[+] wolpoli|4 years ago|reply
The web itself is deteriorating.

Instant answers (IA) caused a shift in the way content is written. Content optimized for IA tends to be repetitive and shallow. Viewing content written for IA is a frustrating experience, and it tends to dominate the result page now.

[+] CLLD|4 years ago|reply
It's definitely deteriorating, and the worst part is that it completely ignores quotes if it thinks you meant something else, and shows the results for what it thinks you want. Completely useless in a lot of cases
[+] IAmEveryone|4 years ago|reply
Google corrects spelling in the way you mention, but removes quotes only if there are zero results for the search with quotes. Example, finding one result: https://www.google.de/search?q=%22but+when+therefore+is+an+a...

Getting this wrong is probably why people think we need more than definitive assertions from people operating from subjective impression.

Changing a letter in the query, it says, right at the top:

    Showing results for "but when therefore is an adverb"
    No results found for "but when therefore is tan adverb"
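The behaviour described above (honour the quoted phrase, and fall back to a corrected query only when the exact phrase has zero hits) can be sketched as follows. This is a guess at the observable logic, not Google's implementation: the corpus is invented, and the use of `difflib` for spelling correction is an assumption made purely for illustration.

```python
import difflib


def phrase_hits(corpus: dict[str, str], phrase: str) -> list[str]:
    """Return URLs whose text contains the phrase verbatim (case-insensitive)."""
    needle = phrase.lower()
    return sorted(url for url, text in corpus.items() if needle in text.lower())


def correct(phrase: str, vocabulary: list[str]) -> str:
    """Replace each unknown word with its closest known word, if one is close enough."""
    fixed = []
    for word in phrase.lower().split():
        if word in vocabulary:
            fixed.append(word)
            continue
        close = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.6)
        fixed.append(close[0] if close else word)
    return " ".join(fixed)


def quoted_search(corpus: dict[str, str], phrase: str) -> tuple[str, list[str]]:
    """Return (query actually used, hits); fall back only when the exact phrase has no hits."""
    hits = phrase_hits(corpus, phrase)
    if hits:
        return phrase, hits
    vocabulary = sorted({w for text in corpus.values() for w in text.lower().split()})
    corrected = correct(phrase, vocabulary)
    return corrected, phrase_hits(corpus, corrected)


# Invented single-document corpus.
corpus = {"grammar.example/therefore": "But when therefore is an adverb it takes no comma"}

exact_used, exact_hits = quoted_search(corpus, "but when therefore is an adverb")
typo_used, typo_hits = quoted_search(corpus, "but when therefore is tan adverb")
print(exact_used, exact_hits)
print(typo_used, typo_hits)
```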
[+] kebman|4 years ago|reply
If you search for anything political on Google, you'll notice that the results are clearly slanted in one direction, towards the opinion of a handful of pre-approved news outlets. This leads me to seek alternatives whenever I need neutral sources, for instance Yahoo search.
[+] photochemsyn|4 years ago|reply
I wonder if they're getting cash kickbacks from established corporate media outlets for pushing their material to the top of the search results. That would actually be less creepy than if it's being done as some kind of information manipulation program.

It's high time Google and other search engines were forced to expose the inner workings of their ranking algorithms to the public, particularly now that they have near-monopoly power in the sector. People should also be able to adjust the dials on the algorithm themselves.

[+] GuB-42|4 years ago|reply
I think it is related to the fight against "fake news", "hate speech", etc... People don't tolerate a truly neutral search engine, because it will reflect human nature and human nature is not always pretty. I remember the time when Google returned antisemitic websites when searching for "jew", they refused to do anything about it because "jew" is used mostly by antisemites and therefore, an antisemitic website is what people searching for that term most likely want, the search engine did its job. I don't think it will fly today.

So search engines now have to get the "truth", preferably the politically correct one, and since you can't rely on the crowd for that, you have to introduce bias, and "pre-approved news outlets" are the most obvious choice.

[+] narrator|4 years ago|reply
Try "what countries are using ivermectin" in google.com and then try Yandex.com. For me, the third site on Google (the kitchen sisters) appears broken and the rest are all some variation of "why ivermectin is bad" articles. Yandex actually answers the question.
[+] remus|4 years ago|reply
I find these responses fascinating as the "clearly slanted" results tend to change direction depending on the political affiliation of the person making the claim! Having said that, I'd love to be proved wrong if you have any evidence to show a particular bias one way or the other?
[+] dustintrex|4 years ago|reply
As a corollary, search on Google News (as in, browsing to news.google.com and searching there, or !gn via DuckDuckGo) is really bad. The index seems to update really slowly, so breaking events are usually missing entirely, and the grouping of articles into single events is also quite broken.
[+] JohnJamesRambo|4 years ago|reply
Sounds like you want non-factual answers.
[+] 6510|4 years ago|reply
That strikes me as a wonderful idea for a new search engine: Just politics. Authors and sites could create a mini profile to refine the results.
[+] lvs|4 years ago|reply
Oh give it a rest.
[+] andrew_|4 years ago|reply
The most frustrating part of using Google these days (for me anyhow) is Google returning results that don't match terms that I specifically wrap in quotes. If I search for:

"gamakatsu octopus hooks"

I expect to only receive results for that. Instead I get bombarded by results that match a portion, or when Google thinks I tangentially might have meant something else. There was a time when it respected the quote characters, but those days have long since passed.
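The expectation stated here, that every quoted phrase must appear verbatim in each result, amounts to a strict post-filter. A minimal sketch, assuming results arrive as hypothetical `(url, text)` pairs with invented sample data:

```python
import re


def quoted_phrases(query: str) -> list[str]:
    """Extract the phrases wrapped in double quotes, lower-cased."""
    return [p.lower() for p in re.findall(r'"([^"]+)"', query)]


def enforce_quotes(query: str, results: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep only results whose text contains every quoted phrase verbatim."""
    phrases = quoted_phrases(query)
    return [(url, text) for url, text in results if all(p in text.lower() for p in phrases)]


# Invented results, as a search engine might loosely return them.
results = [
    ("shop.example/1", "Gamakatsu Octopus Hooks size 4, red"),
    ("shop.example/2", "Octopus hooks by another brand"),
    ("shop.example/3", "Gamakatsu treble hooks"),
]
strict = enforce_quotes('"gamakatsu octopus hooks" size 4', results)
print(strict)
```

Only the quoted part of the query is enforced; the unquoted words (`size 4` here) are left to whatever ranking produced the candidate list.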

[+] anont094h0|4 years ago|reply
What's galling is that they've actively gone out of their way to make it worse, instead of just letting it regress through neglect.

For example, a few weeks ago, I image searched for a meme that I created years ago on 4chan. A dozen or so results were returned, none of them relevant. But if you tack on the name of a 4chan archive, for example "4plebs" (not even "site:4plebs..."), all of a sudden it turns up.

Google in general seems to penalize 4chan and its archives, which is ironic since it's one of the few places where actual humans post OC. Meanwhile Pinterest spam, AI-generated blog posts, and reddit threads full of bots and shills abound in its results.

[+] anont094h0|4 years ago|reply
Speaking of 4chan and google's declining result relevancy, a particular instance of the latter was discussed there (one of the few places it could be discussed, given the amount of censorship that prevails everywhere else these days):

https://desuarchive.org/g/thread/76372135/

This is still the case today, at least in the US (I just checked). Instead of emphasizing the painting of Beethoven we all know, the one that was actually done during his lifetime, the one featured in the infobox of his Wikipedia page (which is also the top link result), it instead emphasizes a much more obscure painting that was done posthumously, for no obvious reason other than it giving him a noticeably darker skin tone. I'm not even offended by it, I just find it ridiculous that Google actually went out of its way (probably for pc reasons) to train their algorithm to return less relevant results.

[+] ramoz|4 years ago|reply
When I search “tim lee food blogger age” Google actually shows results with “age” struck out (so it shows top results as if age wasn’t part of the queried string).

Trying to think why/how it’d conclude that age wasn’t necessary for good results.

[+] nspattak|4 years ago|reply
Google Search's results are often wrong because of corporate choices. In Greece there is a completely independent news website which for some reason just isn't registered as a news website with Google. As a result, not only is this website shown less than others (in the Google feed or in search results), but in the past there have been cases where this website was the first to publish a story and Google search only returned other news sites which reproduced the story, even using the original material!

In my opinion, google has become too big and has lost focus on actual quality/engineering.

[+] halpert|4 years ago|reply
I believe the theory that Google is optimizing for ad revenue. Sites without ads get ranked lower. The biggest example I can think of is Wikipedia. When I search a proper noun with a Wikipedia article, I almost always want to go look at that article. Recently, I feel like I really have to dig for it.
[+] Hard_Space|4 years ago|reply
I share the view that Google SERPs have dropped in quality over the last 5-7 years. Of great annoyance to me is the amateurish way that a search results page will find relevant Twitter results but then clicking on the results takes you to the root page of the Twitter user and not the result. Since many Twitterers are prolific posters, it can be very time-consuming or even impossible to find the result listed. Thankfully Inoreader takes me to the exact Twitter result.
[+] pyrrhotech|4 years ago|reply
One of the worst things I've noticed recently about Google Search is how it is very anti-startup because of the concept of the Google Sandbox, an essentially arbitrary length of time they put a huge negative penalty on your site to try to entice you to buy paid ads instead before your funding runs out waiting for organic traffic.

Perhaps that's my biased opinion on their motivations as I've recently launched https://grizzlybulls.com and yet even though Bing has tiny market share, I'm getting 10x more organic traffic from Bing rather than Google...

[+] RandyRanderson|4 years ago|reply
One reason it might have deteriorated is that goog is constantly battling ppl 'optimizing' their content for Google, while competitors likely see less than 1/1000 of this.
[+] greyman|4 years ago|reply
I don't know if search is deteriorating as a whole, but certain searches seem to be manipulated for political reasons. The famous example is an image search of "white couple" - really, try it, it is like only 50% correct. But I don't believe the image search itself would be that bad; rather, certain queries are given manipulated results.
[+] fdgsdfogijq|4 years ago|reply
One theory is that Google has made a substantial change to a neural network based search, and they are still working out the kinks in getting it to work. How could it not be A/B tested such that we wouldn't notice the bad searches? I am not sure of the answer to that. I read their research publications, and the NLP research coming out of Google is far beyond any other company. I can only imagine what they aren't publishing.
[+] hooande|4 years ago|reply
The methodology in this article is terrible. It makes me doubt that the people at Surge HQ understand even the most basic scientific concepts.

This is like doing a taste test between two sodas where one is clearly labeled "Coke" and the other labeled "Pepsi". It will end up measuring branding and public perception instead of anything empirical or even objective.

This isn't a measurement of search quality, it's a public opinion poll with a sample size of 250. In fact the whole thing is a poorly disguised advertisement, and I don't think it serves them well.

[+] AlbertCory|4 years ago|reply
I've been gone from G for 4 1/2 years now. When I was there, the weekly meetings often featured "search quality" measurements that were rigorous in their objectivity (I thought). They bent over backwards to be non-self-deluding.

I distinctly remember Udi Manber saying "if the web is slow, it's our fault" (actually, the speech was that everything is "our fault"), meaning, really, "take responsibility for problems and don't throw up your hands."

However, the natural tendency of any organization is to reward the suckups and promote mediocre people who just get along with everyone. It wouldn't surprise me if that's what's happened with Google, too.

[+] echen|4 years ago|reply
Yeah, when I was at Google, the SERP (and caring about users) was a shrine at all levels. Here's an example of one of the weekly Search meetings AlbertCory is talking about: https://search.googleblog.com/2012/03/video-search-quality-m...

For example, my first week at Google/YouTube, I was in a New Hire meeting with our VP. Someone asked about profitability, and he responded that Larry said we didn't have to worry about revenue yet, since the main goal was user growth/happiness, and revenue could come later. Which I thought was fascinating, considering how big YouTube already was at the time (in 2013)!

Though I think this changed a year later, and I find YouTube ads a poor experience compared to Instagram and TikTok -- which aren't merely "better than the rest", but stuff I actually enjoy watching.

[+] literallyaduck|4 years ago|reply
Go to image search and type in "white people" without quotes.

Now repeat the search but with "black people" without quotes.

Try the same for "white couple" without quotes.

Try the same for "black couple" without quotes.

Do you have any insight to this phenomenon?

[+] whakim|4 years ago|reply
It seems weird to me that Google would take "search quality" seriously for 15+ years (including nearly a decade as one of the biggest companies in the world) and then suddenly stop. Are you able to share any of those objective measures of quality? Because it seems to me that most of the discussions around the declining quality of Google search amount to anecdata backed up by reasoning that doesn't really make a lot of sense to me (e.g., "Google only cares about short-term revenue!")
[+] bhawks|4 years ago|reply
I think just attributing this to the right people not being in charge oversimplifies the problem.

Search is a complexity beast and simply continued to grow in complexity during the several years I worked directly on it. Folks were proud of the fact no one could even enumerate all the features in the system (attempts were made and abandoned).

The tools to change search safely weren't keeping up with the complexity of the system. Understanding impact with evals and experiments became much harder. Gwsdiff and friends grew flakier. Debugging had so many different entry points depending on what you needed to do.

The search stack deserves some really deep cleanups and refactoring, the eval and devtools are similarly in need of a ton of love.

[+] foxfluff|4 years ago|reply
> the weekly meetings often featured "search quality" measurements that were rigorous in their objectivity (I thought).

I wish this part got discussed, but every time I've attempted it, the discussion has been shut down by "lol they're experts at search and you're not and you don't know what they know."

I wouldn't put it past Google to be blindsided thinking their own metrics are objective (perhaps they are objective measurements of something but not of what they actually want to measure). If anything, the battle with SEO just shows how hard it is to do something right and avoid getting gamed. If they can't rank SEO spam off the front page, why would I believe their measurements are any better than the rankings?

There's also always a small possibility that metrics are worse than wrong; they could actually say everything is fine, keep serving these long form SEO spam articles that people click and read for far too long before realizing it doesn't have the answers they seek.

[+] foobiekr|4 years ago|reply
But slow/fast isn't quality. What were the rigorous measurements? Latency of results to click? User clicked above the fold?

I have wondered about this now for several years, because from my point of view Google search has steadily degraded since around 2010. The specific degradation isn't that it returns nothing, or irrelevant pages, but that it returns mostly _recent_ content, and returns very little _older_ content which is still assuredly on the web.

I can see how Google's approach will work for the average search query - after all, the average query is probably about something happening now - but that doesn't equate to _quality_.

[+] fnord77|4 years ago|reply
Just an anecdote, but I see quite a few people in my LinkedIn network who have joined Google within the last few years. People I wouldn't consider "Google quality".