
Google search's death by a thousand cuts

313 points | rckrd | 2 years ago | matt-rickard.com

319 comments

[+] hooby|2 years ago|reply
I feel that Google search has been slowly dying for many years now.

Where in the past I could find documentation, forum-posts, wikis - concise information - these days I find SEO-optimised marketing fluff, link-farms, clickbait, long-form articles with thousands of words that ultimately say nothing, political tirades that don't make sense for any non-U.S.-citizen, and ads - tons and tons of ads.

It would seem that Google ads and Google search's own algorithm directly led to this state of affairs.

And that's the main reason why asking some LLM for answers is so much more attractive than using search: the AI cuts through all that crap, and extracts the actual information. If it's not hallucinating, that is.

The web is no longer for humans or for finding information. It's for bots, crawlers, for the Google algorithm and for marketing.

Humans simply can no longer digest it - we need AI to generate something more concise, more to the point, more digestible. Without the marketing, fluff and bait and filler.

[+] joelanman|2 years ago|reply
They very much killed the golden goose. It's a real failure to think in terms of systems. Google relies on a healthy open web, yet as arguably the biggest single influencer of the web, it has completely failed to maintain it, let alone improve it.
[+] StrictDabbler|2 years ago|reply
The other day I searched for "canned cabbage soup". The mobile website locked into a mode I've never seen before.

It would only provide blogspamish recipes for canning your own cabbage soup linked through the tiled suggestion interface. Not a single text-style link was returned and none of the results were pre-canned soups. I scanned through page after page without getting anything else.

Oddly, when I searched "canned cabbage soup history" it switched modes and started providing normal results, including the product and Wikipedia.

What strikes me about this is that I was actively searching for a product I wanted to buy. If there's any business case for ruining Google with weird specialty searches it would be selling products.

Why is there a "provide blogspam" mode?

Also, I'm now asked if I'm a robot at least five times a day.

[+] sixothree|2 years ago|reply
Last night I searched for a phrase that was the title of a page. Zero results. I changed my search and what I was looking for was the first result. Google has really gone downhill severely.
[+] JohnFen|2 years ago|reply
> we need AI to generate something more concise, more to the point, more digestible.

This is what I don't want from a search engine. What I want from a search engine is a list of links to places that have relevant information. I don't want the engine to summarize or distill what those places are saying. Too much knowledge and understanding is lost by not doing that myself.

[+] jrnichols|2 years ago|reply
> Humans simply can no longer digest it - we need AI to generate something more concise, more to the point, more digestible. Without the marketing, fluff and bait and filler.

I think one of the fears some have about AI/machine learning/etc is that they realize their entire advertising & marketing world could become irrelevant. Users could get short, accurate answers without having advertisements in the way.

Kind of like how things were many years ago.

[+] bushbaba|2 years ago|reply
AI LLMs are just curating what content to train on vs indexing the www. It’s actually even more gatekeeper-y
[+] postmodest|2 years ago|reply
The entire internet has become a market for ads traded among corporations, and that the ads themselves are useless to 70% of the audience is meaningless because the market has its own momentum now.

The side effect is that the below-median brainpower of the 30% of the audience who doesn't know any better than to watch ads and click on them is now the driving metric for content.

In a world where programming revenue comes from ads, and only idiots see ads, the programming will cater evermore to the tastes of the most-idiotic viewer.

[+] wesapien|2 years ago|reply
Can you give me an example of something you can't find and I'll try to find it?
[+] superasn|2 years ago|reply
The funny thing is that the sites which should have been removed, like the Stack Overflow spam clones, or sites like Canva and Pinterest that make thousands of similar-looking pages with slightly different headings, are still allowed and rank on Google.

I also hate the top 10 pages whenever you search for the best something, like best domain name registrar. I don't want to read a spam blog post with affiliate links; I wish Google would show me the actual domain registrars instead (like ChatGPT does when I ask it). Google has been gamed so badly and they have been doing nothing about it just because the spam blog posts contain their ads.

There are some tricks I learned on HN to use uBlock origin to filter these spam sites but Google really needs to fix this. There is only so much an adblocker can do to fix search. And right now all the useful content is getting blocked while the spam content is not only allowed but ranking on top of everything.

[+] xbkingx|2 years ago|reply
The moment I saw the term "SEO" it was like a stopwatch ticked on until search was dead. It used to be frowned upon to do little tricks, like keywords in a tiny, transparent font picked up by crawlers, but not seen by users.

When gaming search engines became a profession, the end of search appeared on the horizon. Guess we're headed back to web rings and link indexes (which will be consolidated, heavily monetized, and abandoned). If we're lucky, we'll be back to dialup BBSes by the 2040s.

[+] nikanj|2 years ago|reply
I always have a feeling they could get surprisingly far by having one person doing random searches, then tagging all the obvious spam to be deindexed. But Google is very adamant about never hiring a person to do a job well, if AI can be trained to do the same job badly
[+] tacker2000|2 years ago|reply
Yeah, why can't I just remove these spammy Stack Overflow clones? They waste my precious time whenever I need some answers. Seems like such an easy feature, but yeah, Google already lost its way a long time ago…
[+] WirelessGigabit|2 years ago|reply
I googled something.

One of the results was "title (recommended)".

That should be enough for Google to ban the result.

[+] QuantumGood|2 years ago|reply
You can use the AdGuard Annoyances list in the uBlock Origin settings to block certain types of content. This includes blocking cookie scripts that pop up every time you opt out of non-essential cookies. It also has an unbreak filter list that, when used in combination with EasyList, can remove anti-adblock warnings.
[+] lucidyan|2 years ago|reply
> The funny thing is that the sites which should have been removed, like the Stack Overflow spam clones, or sites like Canva and Pinterest that make thousands of similar-looking pages with slightly different headings, are still allowed and rank on Google.

I recommend this userscript https://github.com/vladgba/Back2source for avoiding Stackexchange clones, it saved me a lot of time.

[+] sjkdfkjdsfhjkhs|2 years ago|reply
It's also not clear why Google killed Blogger.

More investment in Blogger with better linking between blogs (the way you can @someone on Twitter) and discovery (#tag searches allowing you to see posts from multiple blogs, curated blogrolls), more realtime/live logging functionality, easy microblogging functionality, etc. would probably have led to a lot more content existing on Google owned spaces, as opposed to being siloed behind Facebook and Twitter paywalls.

[+] benabbottnz|2 years ago|reply
> The funny thing is that the sites which should have been removed, like the Stack Overflow spam clones

Absolutely. I have a browser extension installed specifically to block domains that are this type of spam from my Google search results, and I’m adding to it almost weekly.

Come on Google, you’re seriously ranking that as front page worthy?

[+] redsaber|2 years ago|reply
> There are some tricks I learned on HN to use uBlock origin to filter these spam sites but Google really needs to fix this.

Oh, I did use uBlacklist, which blocks results in both Google search and Google image search. But I'm curious: what filter do you have in your uBlock Origin?

[+] DeathArrow|2 years ago|reply
Is there a good list of spammy sites for uBlock Origin?
[+] moneywoes|2 years ago|reply
May I ask which filter list you are using?
[+] hsjqllzlfkf|2 years ago|reply
This is Google's inability to innovate.

Google's stated goal was to "organize the world's information". Nevertheless, they didn't come up with Wikipedia, the highest quality curated human-readable information repository. They also didn't come up with ChatGPT, arguably the first LLM good enough to perform non-trivial tasks of data recall and organization.

Google had early success with search. Then decided to cement their lead by just throwing money at "all the smartest people"[1] so that their competitors would have difficulty hiring. This put Google in the situation where it's better to spend its resources fighting to keep the advantage that they have (by controlling the Internet infrastructure; android, chrome, email, etc) than innovating.

Google is a dead man walking, unless something very substantial changes.

[1] "all the smartest people", read, "academically successful but rule-obeying, unimaginative, and risk-averse people".

[+] foobarbecue|2 years ago|reply
Doing planetary science, I used to be in the habit of searching for image IDs on google. Usually particular orbital images of the moon which had been mentioned in a paper. There are of course specific archives to search for this stuff, but google used to give surprisingly good results. Now I get literally zero results when I search an image ID. What changed? Did they remove their index of the Planetary Data System? Did they remove the indexes of planetary science papers?
[+] shalmanese|2 years ago|reply
Google's death is deeper than just the corpus; their ranking algorithm is also badly degraded.

A week ago, a search for < James Palmer manchester foreign policy > wouldn't return https://foreignpolicy.com/2017/05/23/i-love-manchester-but-p... on the first page while < James Palmer manchester site:foreignpolicy.com > would [1] [2]. This appears to be fixed as of today but this kind of kindergarten level search intent not being fulfilled would have been unthinkable for Google even just a few years ago.

[1] https://twitter.com/BeijingPalmer/status/1670904508191322112

[2] https://twitter.com/shalmanese/status/1670973880637493249

[+] thesuperbigfrog|2 years ago|reply
This is the Internet version of the tragedy of the commons: https://en.wikipedia.org/wiki/Tragedy_of_the_commons

So many people and groups want to profit with as little effort as possible that the commons (freely available data, open APIs, FLOSS software, etc.) is being overrun with little to no regard for the long term effects.

Data can be copied endlessly, but it has to be generated the first time and updated or else it becomes stale and its usefulness decays. If no one is willing or able to generate or update it, there is nothing good and the signal-to-noise ratio falls off sharply. Everything is noise and it sucks.

[+] not_your_vase|2 years ago|reply
Many years ago, Google was very useful without Reddit. There is a reason why they are such a behemoth: they actually used to create very usable tools and services. People wanted to use them, because their products were good. Search, Gmail, Maps, Translate (etc)... all of them are (were) gems.

Don't know if they let search rot or they broke it intentionally. But it got broken regardless of Reddit. Reddit has a lot of info I guess, but it is far from being the only website on the internet. However, Google has seemingly stopped indexing more than the top 150-200 websites on the internet (and even those results are often lacking the searched words).

[+] monsieurgaufre|2 years ago|reply
I believe it was also caused by the silo-ing of the internet and that a lot of the « public space » is now behind a login.
[+] ramraj07|2 years ago|reply
Google search is broken because the internet has evolved in the AdWords era. Every person with a heartbeat has realized the monetary potential of page views and SEO, so there’s a race to the bottom to capture the top spot for every search term in Google. Even if Google comes at it from an objectively user-centric perspective, this seems like an insurmountable problem to solve. But adding insult to injury, post-Sundar Google is just IBM/Oracle at this point. So it’s just circling the drain at double the acceleration.
[+] x0x0|2 years ago|reply
They broke it intentionally for money.

The second they paid people to game their search engine (i.e. AdSense), the entire thing was on the road to uselessness. There always would have been that incentive, but Google turbocharged it, with $33B of incentive in 2022.

[+] HelloNurse|2 years ago|reply
> Don't know if they let search rot or they broke it intentionally.

From the point of view of a shortsighted company this might be a false dichotomy because if ad revenue from search grows, search is fine: neither rotting nor broken.

You are assuming that Google treats users as customers rather than as an exploitable natural resource.

[+] danpalmer|2 years ago|reply
> What if Wikipedia started charging or restricting API access?

Wikipedia has a downloadable data dump that would cost almost nothing to serve to Google, and it has an organisational mandate to make that data available. If they decide to charge for access to that I'm sure Google can afford it. Let's not throw around completely unrealistic hypotheticals.

[+] andersrs|2 years ago|reply
> What if Wikipedia started charging or restricting API access?

Then my puzzle game Redactle (https://redactle.net) would be threatened, much like what happened to GeoGuessr being charged for Google Maps.

Related to the article: I've also had trouble ranking Redactle on Google because there are a few poorly implemented ad-ridden versions which are part of link farming groups. Many of them pretending to be my 'Redactle Unlimited' brand. Google loves a bunch of spam sites linking to each other more than thousands of links from authenticated users on Reddit, Facebook etc apparently.

[+] throwaw12|2 years ago|reply
> If they decide to charge for access to that I'm sure Google can afford it

Sure, Google can afford it.

How about asking for 10% of revenue when Wikipedia data is used to show search results? Would Google agree, or would it start a rival to Wikipedia and kill it after 4 years?

Systems are so complex that it is very difficult to predict how they will behave when parts of the system are impacted by other constraints (Reddit vs OpenAI data scraping, Reddit's urge to make money, Reddit moderators protesting against rules, a 2-day blackout extending to infinity, Google releasing LLM papers which are impacting its own business through ChatGPT, and so on).

[+] fidotron|2 years ago|reply
The true meta problem here is the age old question of how to fund services, including search, on the web. Google, Twitter, Reddit, Gfycat are all immediate variants of the same problem, which does not have a technological solution. The naive idea that all these things can be free, especially when promoted by those in an ecosystem as propped up as it is by ad revenue, is ridiculous.

Someone has to pay for these things to be developed and operated and typically when they do they get to call the shots, which leads to the modern UX disaster.

At least now the VC situation means we have a lot less product-dumping-by-any-other-name intended to destroy the market for legitimate participants.

[+] wslh|2 years ago|reply
Google's problem now is neither Reddit nor Twitter. Google has turned into just a brand with really awful results, full of AdWords campaigns where it is not clear what marketers are paying for. It seems like they need to return to the original Brin and Page paper [1]. Google continues to do amazing things (e.g. AlphaZero, Project Zero), maybe in the same way as Xerox PARC did amazing things in the past, but nothing really new comes from the organization. BTW, Google Workspace (e.g. Sheets, Docs, etc.) is a good competitor in the office space: good business execution, though the search in Google Drive is awful.

[1] https://research.google/pubs/pub334/

[+] troymc|2 years ago|reply
Unlike the author, I'm not very worried about Wikipedia. Yes, they ask for donations, but that's not a sign that they're dying; donations are their main revenue source. That might be unusual among the web's top sites, but it's not in the grander scheme of things, i.e. there are lots of charities and nonprofits.

I didn't know that LLM training sets used Wikipedia. I would have thought that the CC-BY-SA license (on all text) would keep them away. It's not like Chatbot-4000 can cite all the Wikipedia articles used to train it, nor is Chatbot-4000 going to license all its output under a CC-BY-SA license (as required by the license terms).

[+] jerzyt|2 years ago|reply
Google's Search is dying because of Thousands of self-inflicted cuts. A couple of years ago there were about 3 or clearly marked ads on the first page, followed by generally useful results. Right now, I don't have the patience to scroll past the ads which are almost indistinguishable from useful links.
[+] labster|2 years ago|reply
The cut that hurt the deepest for me is when negated search terms stopped working. You used to be able to get subsets of data, finding things without a homonym, but now the algorithm just gives you the most common results and the most paid ads.
[+] netrap|2 years ago|reply
I am starting to wonder more if Google search died or if the web did. You read a lot of comments about how in the past you could find a lot more information, but the reality is that no one is posting that information outside of the walled gardens. It's no wonder Google search sucks: if they don't have direct access to these walled gardens, the information is locked behind the door...
[+] peepeepoopoo14|2 years ago|reply
They need to stop trying to so heavily curate what users are allowed to see in their searches. Since 2020, Google's attempts to operate as the internet's ideological censor have had a noticeable negative impact on the quality of their search results.
[+] hospitalJail|2 years ago|reply
This post doesn't seem frontpage worthy, but I'll join in on the Google bashing whenever I have a chance.

The other day I tried to Google search 4chan; imgur and reddit came up before 4chan. Heck, 4chan wasn't even on the first page...

On a similar note, they are trying their best to hide Wikipedia; now I have to specifically mention wiki most of the time.

I have the best website for a specific thing, I do well in quite a few SEO searches. However, the generic term, (which I'm still the best in), I am nowhere to be found. Blogspam and lower quality advice are all over the first 2 pages. Heck, there is actually terrible advice in the first 2 pages, like dis/misinformation.

The other day I was searching something I knew existed, I knew the page, and google would not give me the page. Ended up typing in the website.

Whatever is going on at Google is in major red flag territory. We need to perma-fork Android, we need to get off Chromium, find an alternative to Pixel. We are near the end times of Google; they are going to be AOL, taking advantage of old people and those who refuse to change their ways.

[+] d--b|2 years ago|reply
What changed?

Just an assumption here. But could it be the fault of the web rather than Google's?

Could it be that the web itself is being so flooded with SEO-ed crap content that even Google can't sort through it all?

[+] stevage|2 years ago|reply
I would really love to read a deep dive on why Google Search quality has declined so much. There's a lot of anecdotes but I haven't read much that sounds particularly authoritative.
[+] cma|2 years ago|reply
In the TPUv4 paper it sounded like Google search moved over to some really crude AI embedding vector stuff. If you search "book of master system" it gives only results for "book of genesis" because it mixes up the biblical book of Genesis with the Sega Master System.

Searching any programming terms, like camel-case variable names from a system library without exact quotes, just gives random results about Katy Perry and stuff now.

[+] oneshtein|2 years ago|reply
Google search results are driven by clicks. There are many more non-programmers than programmers, so they pop non-programming topics to the top of the list. The average user doesn't care about programming languages. Google is a general search engine, not a specialized one. There was an attempt to spy on users and split them into different buckets, but it's dangerous to the population, because a government can just ask Google for the list of «wrong-thinkers».
[+] supriyo-biswas|2 years ago|reply
I looked for “book of master system” and it gave me results about the SEGA book.

Granted, vectorization could be one of the issues plaguing search, but it isn’t the primary issue.

[+] nottorp|2 years ago|reply
Is the article's author worrying about Google specifically or search generally?

Because Google specifically has been degrading from self-inflicted cuts for years.

Agreed that reddit going dark and machine generated "content" will only make it worse, but perhaps he should have talked about search generally then?