Surely people can relate to the situation where you end up on an article based on some technical query you have. The article repeats your question 7 times, has endless casually-related filler text that still does not answer the question and then ends with: try to unplug it.
It is so freaking obvious that it's a malicious content farm, but Google, with all of its technical might, seems unable or unwilling to detect it. If tech can't do it, organize some type of curation or feedback?
Same for image search. You search for "red flower Thailand" and flowers of various other colors from various locations appear. The idea that Google is spectacularly good at subject detection from imagery does not seem to actually work out in practice.
Most people's search queries consist of just 2-3 words. Nowadays Google consistently just drops the last word as if it knows better than I do what I need.
High value elaborate articles on various topics do not rank. Instead, dated articles do. You have to manually bookmark high quality content as you see it, because you'll never find it back via search.
Is everybody asleep at Google? This is not a small thing, this is your bread and butter. Teens are using Tiktok for search, you're in real trouble and better start cleaning up your act.
> Same for image search. You search for "red flower Thailand" and flowers of various other colors from various locations appear.
Now, if any of these flowers are next to a red dress, tapping the dress will reveal links to places you can buy it.
Google is not asleep. It has just got its priorities wrong. (Or rather, incentives in this organization seem to reward not what users like me appreciate.)
The thing is, all the stuff you’re listing as things they suck at properly detecting, filtering, categorizing, they used to be extremely good at. The Google search that exists today is markedly worse than the Google search of 5 years ago. Wtf happened to cause search to just rot away into a useless mess of results that used to be very high quality? It’s such a night and day regression that I’ve legit wondered if this is a sign of the Mandela effect.
As an example of how bad the searchability is nowadays, I’ve been creating and expanding my own knowledge base (something like a personal wiki with links to interesting content I find) for about a year. It seems to work very well despite the effort it takes to keep it organized.
Honest question, how much of this is simply due to Google slowly showing more & more ads on Page 1 of search results?
There used to be a time when paid placement was only 1-2 results.
It’s frequent now that the top 5-6 results are paid placement.
(And when I’m doing a search for a specific product I know I want, competitors are bidding up those search terms which is annoying because I’m being shown not what I’m explicitly searching for)
Kagi just renders those toxic sites into a grouping called "Listicles" which I then ignore. It's far from perfect, but it's clear that a company with far less money and access than Google doesn't find this an impossible problem to address.
So I would suggest that Google knows what it's doing, it just makes them money.
It's because Google has never been focussed on search quality imo. No search engine produces high quality results anymore.
Especially since you get the clear spam sites that somehow reference your query in the page content (where they've just spammed loads of keywords, but also pretty sure some spam sites are doing something dynamic with it).
Google has maximized advertisement $ and that's all.
We're all technically minded here but very few people really understand how technical choices add up to greater detriments.
and that's today's Google. they minimized the index and maximized the searches that yield profit through Google ads. those websites you hate? they monetize Google ad words.
the one that pisses me off most is installing/configuring a software package... Top articles always end up being "apt-get install foo" and never address the configuration at all.
> We find that only a small portion of product reviews on the web uses affiliate marketing, but the majority of all search results do. [...] We further observe an inverse relationship between affiliate marketing use and content complexity, and that all search engines fall victim to large-scale affiliate link spam campaigns.
I think this is an excellent methodology for testing the quality of search results. I would love to see a standard search engine test and scoring system based on this, maybe similar to some of the LLM scoring systems.
This doesn't apply to the content complexity finding, but the finding that "product reviews which are in top search results are more likely to contain affiliate links than product reviews which are not" can also be explained by the fact that if and only if I am getting a bunch of hits on my product reviews, I'm incentivized to monetize that with affiliate links.
Forgive my naivety, but wouldn't a simple way for a search engine (like Kagi) to avoid falling victim here be to detect affiliate link programs? There's got to be a small handful of patterns for affiliate link tracking:
1. Domain Interception & HTTP redirects
2. Tracking codes embedded in the URL directly
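The two patterns above could be checked roughly like this. It is only a minimal sketch: the parameter names and the redirect host are made-up placeholders for illustration, not a list taken from any real affiliate program, and a real detector would need far more patterns.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical, illustrative values only; not taken from any real affiliate network.
TRACKING_PARAMS = {"tag", "affid", "aff_id", "affiliate_id"}
REDIRECT_HOSTS = {"go.example-affiliate.com"}


def looks_like_affiliate_link(url: str) -> bool:
    """Return True if the URL matches one of the two naive patterns."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Pattern 1: the link points at a known redirect/interception domain.
    if host in REDIRECT_HOSTS:
        return True

    # Pattern 2: a tracking code is embedded directly in the query string.
    params = parse_qs(parsed.query, keep_blank_values=True)
    return any(name.lower() in TRACKING_PARAMS for name in params)
```

A check like this only sees the link as written in the page source, so anything hidden behind JavaScript or a URL shortener would slip through.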
this is one way to do it, but I wouldn't say it's sufficient. If I search for 'things to do in Seattle', you get many 'blogs' and such that a writer gets paid by sources to insert their place into the things to do list. I didn't word that well, so for example: I own a coffee shop, I pay them moneys, and the '25 things to do in Seattle' writer puts my coffee shop in the list.
If I do an image search for the word 'strawberry', how many of those results are not stock images, images from a store, etc. of a strawberry? can you find an actual picture of a strawberry sitting in the wild? or just some picture of a strawberry a person uploaded without trying to sell you something?
I don't know, feels like a paper titled "Is Google Getting Worse" could have benefited from actually looking at Google results rather than only results of other search engines.
Edit: This got downvoted to hell, so let me be more explicit. This study did not look at Google results, the title is pure clickbait. They used Startpage results as a proxy for Google results. I don't think that's a valid assumption, even if Startpage is using Google's index.
Yes. Google used to be amazing, then it turned into an advertisement company. Slowly at first, then about a decade ago the pace increased.
But the worst part is, Google SEO has infected the entire web and made it into complete garbage.
Hopefully, this last decade or so will just be a blip before we return to baseline, where it can be wild and free again.
Are there stats* of Google Search across the years? I feel I don't use Google as much as I used to. And it isn't because "I know more stuff", but mainly because the way we use the internet has changed. I wonder if kids or teens (most of whom don't know how to use an email inbox) would use Google... (I guess yeah?)
* Of course, the stats should include the total amount of internet users globally, or normalize the amount of searches based on that...
I've been in web development and SEO for almost 20 years now.
When I first started out all the veterans of SEO kept telling me not to do this, don't do that with things that could get your site buried in the SERPs. At the time Google's algorithm was really good at ferreting out affiliate links, link farms and other nefarious black hat techniques SEOs used to game Google.
Now? Complete opposite. I have several freelance clients and I've used every dirty SEO trick in the book and all of them have worked like magic to get my clients' sites ranked on page 1 or 2 of the SERPs.
I have no idea what changed, but Google is super easy to manipulate now to get your site or specific pages ranking really high. I haven't heard or seen any of the horror stories I read and people blogged about constantly when I first started out for years - which tells me they're all probably doing the same thing I am and not seeing any repercussions.
Maybe Google doesn't care because users have become so savvy, they can filter through a ton of garbage in minutes to find what they really want?
Definitely I think the way we use the internet has changed profoundly. There are a lot of apps that provide useful information but they may not be made indexable by search engines. Much less useful information is simply out there on the open web, and much of it is locked behind logins. There were previous deals like Twitter sending a complete copy of all new tweets to Google, but these are basically dying.
It's especially interesting since you mentioned normalizing searches by the number of internet users. The country with the largest number of internet users is China, with more than 1 billion of them. And they don't have access to Google. And their local copycat, Baidu, is years behind Google in terms of technological sophistication and simultaneously years ahead of Google in terms of user hostility. So what do Internet users in China do in a post-search world? They simply open various apps and use the full text search feature of different apps. For general knowledge they might open ZhiHu and search there; for something resembling the old-time personal blogs by individual users they might open XiaoHongShu and search there; for short videos they might open Douyin and for long ones Bilibili. For reaching an organization be it a store or a museum or a hospital or a government department they might open WeChat and search there for an official account or mini program (a mini program is a website that uses WeChat APIs and can only be opened in WeChat).
I made these observations on a recent trip to China and it's clear to me what a post-search world looks like because China is already there.
Long-term stats are tricky because of how much of the landscape has changed. There's the desktop/mobile split, developing countries increasing their internet use, heavier use of apps, growth and decline of results getting indexed, and change in what we search for.
It's not worse when you append site:reddit.com to every single search but this is only a function of the fact that reddit can't figure out how to build their own search. Outside of maybe programming stuff where I'll still click on links, I don't think google has driven me organically to a new site in years.
> It's not worse when you append site:reddit.com to every single search
Could you give some examples of search queries that would benefit from filtering by reddit?
(My own example: I've been looking for recommendations for a solid Linux laptop. A good result would be a list of reviews written from personal experience of owning such laptops. Reddit was useless for that.)
Quite likely underreporting affiliate links due to obfuscation like cloaking, hiding redirects behind javascript (they mention in the paper not rendering the page), using JS and a POST, other URL minifiers etc.
One interesting solution to the problem is to have more than one dominant search engine and its algorithmic choices, having half a dozen web-scale engines with some variation at least gives the user a choice into other avenues of information discovery. (There isn't really much point in using Startpage and DDG here since they're effectively meta search engines of Google and Bing). For SEOs in English speaking countries there is not much point in thinking beyond pleasing Google.
Clearly AI and whack-a-mole spam sites have been a problem for a while due to the prevalence of people tacking on 'reddit' to their query to find other humans talking about stuff.
It seems that the answer from their study is mostly no (reading the conclusion), but they seem reluctant to admit it, so they focus mostly on results being mediocre and spammy.
I visited a small country on my last vacation. I ended up bringing a bit of money back because I was in a hurry and thought it would be easy to exchange back at home, even at worse rates. I live in a big city after all. Of course, I was very wrong.
I spent an afternoon Googling every possible incantation only to get useless AI generated text, travel agency sites or simply unrelated content.
I was about to accept my loss when I tried Kagi. The first page showed an exchange that accepted the currency. Very far from me and with terrible rates, but still.
Anecdata and all, but the fact is that I'm using Kagi more and more and it's winning my trust and good will fast.
As an aside, many full service banks can exchange foreign currency for you. For example, Bank of America does (https://www.bankofamerica.com/foreign-exchange/exchange-rate...). You can also order foreign currency prior to your trip and they'll mail it to you.
Google search for topics I'm unfamiliar with/wanting to learn about all lead to low quality, SEO-optimized to hell, click-baity sites that are just riddled with ads. I have to add "reddit" to most searches just to find semi-relevant content.
But Google search for topics I'm super familiar with and just need a transactional search to look something up tend to be much better and generally the fastest way to accomplish a task.
Right, adding "discussion", "forum" and even "reddit" to the search term increases quality dramatically. Also, Google is still great to search websites. I gave up on Stackoverflow's own search but add "site:stackoverflow.com" to my searches on google.
BTW, maybe someone wants to create a very simple webpage with a search mask that allows adding a few (customizable) terms and options and simply forwards that to Google's search when pressing enter.
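The forwarding part of such a page mostly comes down to building the query string. Here is a rough sketch of that step in Python rather than as a webpage; the function name `build_search_url` and its parameters are hypothetical, and it assumes Google's usual `/search?q=` URL form.

```python
from urllib.parse import urlencode


def build_search_url(query, extra_terms=(), site=None):
    """Append the customizable terms (and an optional site: filter) to the query."""
    parts = [query, *extra_terms]
    if site:
        parts.append(f"site:{site}")
    # urlencode escapes spaces as '+' and ':' as '%3A'.
    return "https://www.google.com/search?" + urlencode({"q": " ".join(parts)})


# Example: always add "forum" and restrict the search to reddit.com.
print(build_search_url("linux laptop", ["forum"], site="reddit.com"))
```

A browser version would do the same string building in a form handler and then redirect to the resulting URL.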
I've started using uBlockList to block those sites and it does clean it up quite a bit. I assume those sites also work for some, but they're usually of low quality to me so I block them.
The second type of searches you describe seem to be better so far, but I've stumbled upon a bunch of obviously generated garbage recently. So not sure how long it'll hold.
I've tried things like DDG, YaCy, Bing and others, but often Google is just significantly better (but not necessarily good).
I didn't know this exists, thanks. Every time I google anything programming-related looking for the docs, and the official docs are buried under a pile of SEO shit, I die inside a little.
This is particularly egregious with Python, and I suppose it must be just as bad or worse in the JS ecosystem.
I wonder if "affiliate links" is a reasonable proxy for page quality? I wouldn't use it directly in a ranking function but it might be a nice automated way to estimate whether results are good.
In a good way or bad way? Affiliate links might bias reviews, but their presence might mean the review was high enough quality that enough people see it and click on affiliate links.
I don't know if the overall search quality has degraded or not, but SEO has definitely become a much more severe issue than ever before. Google is the main target for this attack for obvious reasons but no other search engines are really immune to this. I'm skeptical if this can be tackled by any technical solutions; the problem is not just a specific type of SEO spamming but the structure where the enemies are constantly optimizing against your fundamental goal.
Google is better than Bing and anything else for my day-to-day work. A few things bother me though:
1. There is more spam than in the past. Outrage porn, clickbait headlines etc. are a lot more common than before.
2. Dominance of a few domains despite poor quality content. For a lot of coding-related queries, dev.to, hashnode etc. appear in top results despite being clearly spammy.
3. Paywalled content. The most irritating part is sites like Medium, which appear in top results and have high value content, yet are behind a paywall.
Internet is growing and so are Google's problems but I think they are still on top of things.
This paper does not make sense to me. They used Startpage (https://www.startpage.com) as a proxy for Google results ("result pages of Google (by proxy of Startpage)") in a discussion of Google results?
Except, if you put the same search in Startpage and Google, you get different results. Image results especially are quite different. Text results were mostly just a reorganization on my quick tests. (Tried the title of the paper as its own search "Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search")
Edit: One other notable result is from Figure 3, that a huge percentage of the results now are Amazon and Youtube. Many orders of magnitude in most cases. Amazon (3000-4000), NYTimes (1000), Walmart (~500?), Insider/PCMag/Tomsguide (~50?)
I knew I wasn't the only one thinking this lol - glad it's proven by an investigation
How does one think Google is planning to change its ranking system?
[+] [-] iteratethis|2 years ago|reply
[+] [-] dr_kiszonka|2 years ago|reply
[+] [-] wutwutwat|2 years ago|reply
[+] [-] balder1991|2 years ago|reply
[+] [-] alberth|2 years ago|reply
[+] [-] EA-3167|2 years ago|reply
[+] [-] tomcam|2 years ago|reply
[+] [-] urbandw311er|2 years ago|reply
[+] [-] fennecfoxy|2 years ago|reply
[+] [-] yonatan8070|2 years ago|reply
[+] [-] cyanydeez|2 years ago|reply
[+] [-] jay_kyburz|2 years ago|reply
https://imgur.com/a/sJCECzQ
Looks mostly red to me. A little pink too I guess.
[+] [-] mech422|2 years ago|reply
[+] [-] QuantumGood|2 years ago|reply
[+] [-] aaron695|2 years ago|reply
[deleted]
[+] [-] pants2|2 years ago|reply
[+] [-] oakashes|2 years ago|reply
[+] [-] ryanisnan|2 years ago|reply
[+] [-] autokad|2 years ago|reply
[+] [-] gumby|2 years ago|reply
[+] [-] jsnell|2 years ago|reply
[+] [-] Gud|2 years ago|reply
[+] [-] 101008|2 years ago|reply
[+] [-] at-fates-hands|2 years ago|reply
[+] [-] kccqzy|2 years ago|reply
[+] [-] Axien|2 years ago|reply
Quick: If you want to search on what is the best toaster to buy, what URL do you type?
How do you find out the weather?
How do you find a local brewery in your area?
For me the first answer is Amazon. And the second is ask Siri. The third is Apple Maps.
Google is now below 50% of my search terms. Heck, I’m more likely to search Reddit for some queries.
[+] [-] dehrmann|2 years ago|reply
[+] [-] charlotte-fyi|2 years ago|reply
[+] [-] azangru|2 years ago|reply
[+] [-] barbazoo|2 years ago|reply
[0] https://blog.kagi.com/kagi-features
[+] [-] ricardo81|2 years ago|reply
[+] [-] gniv|2 years ago|reply
[+] [-] tambourine_man|2 years ago|reply
[+] [-] withzombies|2 years ago|reply
[+] [-] odysseus|2 years ago|reply
[+] [-] mike_hock|2 years ago|reply
[+] [-] ado__dev|2 years ago|reply
[+] [-] jansan|2 years ago|reply
[+] [-] Avamander|2 years ago|reply
[+] [-] fuzztester|2 years ago|reply
In longitude, latitude, and by many other measures :)
Plenty of earlier posts and comments about that on HN, for many years now. What's so surprising or new about that, then?
As the saying goes, it's news if a man bites a dog, but not the other way around - doggone it if I know why, man ...
[+] [-] delta_p_delta_x|2 years ago|reply
[1]: https://github.com/iorate/ublacklist
[+] [-] agubelu|2 years ago|reply
[+] [-] guhcampos|2 years ago|reply
[+] [-] NelsonMinar|2 years ago|reply
[+] [-] dehrmann|2 years ago|reply
[+] [-] summerlight|2 years ago|reply
[+] [-] KorematsuFredt|2 years ago|reply
[+] [-] araes|2 years ago|reply
[+] [-] frantic2821|2 years ago|reply
[+] [-] jay_kyburz|2 years ago|reply
Are you allowed to search the internet with an ad blocker installed?
Do you use a special search interface that doesn't return results with ads?