I have exactly this problem. Beej's Guide to Network Programming is indexed just fine. Beej's Guide to C won't index.
The automated tool says it's in violation of some unnamed rule, but I can't figure out which. There's zero SEO, tracking, or ads, and the content is educational and G-rated.
All the other guides index just fine.
I asked for a review and they came back with the same ambiguous message. Eventually I just gave up.
Recently I split the C guide in two. I'll have to check to see if that made any difference.
But it left a bad taste, and now I don't trust Bing or DDG to provide complete results. Google's overrun with spam, but at least my stuff actually shows up on Startpage.
Wow it's beej. I owe you rather a lot of beers, Guide to Network Programming is directly responsible for my entire career. Sorry for the low value post! :}
I may have found the answer, and I've seen this before (it happened to me once). It's when a different spam site copies your content wholesale, and a search engine decides they're the "original" site, and you're the spammy copycat.
It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
Is there also a term for being algorithmically suppressed on a social media platform, I wonder? I.e., a much more subtle, harder-to-detect mechanism whereby the algorithms ensure you get some exposure, but never as much as other, unsuppressed people would get based on similar activity. Or only exposure to a certain limited subset of the graph based on some metrics (e.g. just your 'friends', so no one points out you are effectively shadow banned).
That is what is happening: you think you are not showing up because of bad SEO or better results, but you have to find out through experimentation that you are restricted. The moderator didn't let you know that they have taken punitive action against you.
Indeed. Some people seem to be trying to make it apply any time the top result of an algorithmic ranking isn't what they think it should be (e.g. if they have been deboosted rather than shadow banned).
I suspect OP used it in the way that “I believe all my posts are showing up in the places that they should, but unbeknownst to me, they are being suppressed.” In this instance, I could see a “search engine shadowban” being an appropriate moniker.
The author is extending the concept to include an inadvertent ban. Why would Bing warn him anyway, since there is no user account? Welcome to Cancelbannia.
This triggered me to DuckDuckGo my own site and immediately I notice the top result is someone rehosting my OSS on a page loaded with pages of crap SEO content.
>One “out there” reason I can think is that I use Amazon Affiliate links on my Bookshelf and my /Uses page and that triggers a shadow ban?
For some reason, Beej’s Guide to C Programming is also banned from Bing (and consequently DDG) [1], with the standard robotic non-explanations given when the author asked, even though the rest of the site is not.
You've indicated that you've used Bing's tools to see if your website has been indexed but are silent as to whether you've actually manually submitted your site to be indexed by Bing using their url submission tool [0]. If you do submit the URL and then, after a decent interval, your site still doesn't show up then there might be something to your claim.
XSLT is a template language where you match snippets of XML using XPath expressions and output... anything else, in this case HTML using templates that can make use of the attributes, inner content, etc. of the captured XML portion.
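A minimal sketch of such a stylesheet (purely illustrative, not the actual pretty-feed-v3.xsl file) that matches an Atom feed and emits an HTML page listing its entries:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: render an Atom feed's title and entries as HTML. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:atom="http://www.w3.org/2005/Atom">
  <xsl:output method="html"/>
  <!-- Match the feed root via an XPath pattern -->
  <xsl:template match="/atom:feed">
    <html>
      <body>
        <h1><xsl:value-of select="atom:title"/></h1>
        <ul>
          <!-- Loop over entries, pulling attributes and inner content -->
          <xsl:for-each select="atom:entry">
            <li>
              <a href="{atom:link/@href}">
                <xsl:value-of select="atom:title"/>
              </a>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Referenced from the feed with an `<?xml-stylesheet ...?>` processing instruction, a browser applies this transform client-side and shows the HTML instead of raw XML.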
You can submit a ticket to Bing via their Webmaster Tools website. I've done it in the past and a real human did respond at the time. In my experience, Bing will outright deindex full websites for unknown reasons, while Google will apply a ranking penalty but leave you searchable if their algorithm feels you deserve it.
Had the same problem due to negative SEO campaigns by naughty competitors. Wrote to the Bing Webmaster Tools support team (https://www.bing.com/webmasters/help/webmaster-support-24ab5...) and after a lengthy process got a response that "the issue" had been addressed.
It's been a few months since, and my website is indeed back in the search results, so I advise whoever is having this problem to reach out to Bing.
It would be quite surprising if reverse DNS records were the reason. The vast majority of sites certainly don't have them pointing back; it's impossible with many providers and with basically all CDNs.
Negative SEO scammers use intentional search-policy violations to push down the rank of perceived competitors for a few weeks.
While it is more likely the poster will get a few people to check on the situation and naively drive up page rank, a personal site is just a rounding error for traffic in the long-tail distribution known as the modern web.
Most search engines will correlate user-side telemetry traffic against crawler and web stats. I.e., if the bots tend to prefer your site for abnormal reasons, the ranking algorithm may blacklist signature, domain, and IP sets for several weeks as punishment.
Note too, it is still common for a human employee to manually check a suddenly popular site that pops up out of obscurity. I.e., this catches the more sophisticated cheats, and may have legal repercussions in severe cases.
In summary, if you mess with modern search engines, then expect the ban hammer to fall eventually. ;)
At the bottom of the page there is a comment that says "Some results have been removed", but unlike Google you can't see them. Would be interesting to know if the domain is among the removed results or if the site has not just been indexed yet.
HN taught me firsthand how horrible shadow banning is. You all tolerate it here, so it's mighty hypocritical of you to criticize Bing.
I'll say it again: It comes down to how you treat people. Treat others the way you want to be treated. No one wants to be shadowbanned and we can all agree it is a decidedly cowardly and cruel thing to do.
And you can't use "quality" or anything short of being coerced as an excuse. Techniques and technologies to moderate people without shadowmodding at scale are not just there but very well established. A site for technologists has no excuse to shadowmod other than elitist amorality.
HN's "shadow bans" aren't hidden and allow you to keep posting. Isn't that more tolerant than most other forms of banning?
I have showdead turned on. I see fresh accounts whose every comment is automatically dead even though they're legit contributions, probably because they're using Tor or a widely abused VPN; that's the only common miscarriage of HN moderation I regularly see, and I vouch for those comments. Those accounts should be in the clear after a week or something like that.
I have a couple comments I feel shouldn't be dead, but I can see how others would feel differently, and I have, I believe, 3 dead comments out of >3000 (many of which expressed views others vocally disagreed with, and I generally feel my views are not particularly popular on HN). But most of the dead comments I see are obviously harmful to discussion. The last time I saw hate speech from a banned HN account was earlier today. What is it I'm missing here?
It's all well and good to say, treat others as you'd like to be treated. But I don't want to be harassed either. So I forgo harassing people sure. But what's to be done about the people harassing me?
Are you perhaps unaware of a phenomenon called the paradox of tolerance where, if you extend universal tolerance to everyone, including those who use their speech to silence others (through threats, harassment, shouting over people, poisoning the well, etc), you still end up with a forum in which not everyone can share their ideas?
[+] [-] beej71|3 years ago|reply
[+] [-] KRAKRISMOTT|3 years ago|reply
[+] [-] et-al|3 years ago|reply
[+] [-] cyberpunk|3 years ago|reply
[+] [-] mprime1|3 years ago|reply
(Yes, not adding much insightful conversation. I don’t care if I get downvoted.)
[+] [-] richardjam73|3 years ago|reply
https://duckduckgo.com/?q=c+guide+stdalign&t=ffab&ia=web
brings up your guide as the 6th result.
[+] [-] daflip|3 years ago|reply
[+] [-] crazygringo|3 years ago|reply
Because if you put the headlines (in quotes) from two of his recent articles into Bing, e.g. either "Megan Smith explaining the General Magic prototyping process" or "Denialists, Alarmists, and Doomists", both point as their first result to a URL starting with "https://www.scien.cx", which seems to be the spam site with a copy of each article. (The URL isn't loading right now, however, when I try to visit.)
How to fix it really depends on what techniques they're using to mirror your site, of which there are many.
Example search and resulting URL:
https://www.bing.com/search?q=%22Megan+Smith+explaining+the+...
https://www.scien.cx/2022/12/25/megan-smith-explaining-the-g...
Compare with Google getting it right:
https://www.google.com/search?q=%22Megan+Smith+explaining+th...
https://daverupert.com/2022/12/megan-smith-general-magic-pro...
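When the mirror copies pages verbatim, one common countermeasure (a sketch, not a guaranteed fix, and the URL below is a placeholder) is declaring the canonical URL in each page's head so search engines attribute the content to the original site:

```html
<!-- In the <head> of every page; href is illustrative, not a real page -->
<link rel="canonical" href="https://example.com/original-article/" />
```

A verbatim mirror ends up serving this tag pointing back at the original, which is a hint (not a directive) that engines may use when deciding which copy is authoritative.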
[+] [-] gary_0|3 years ago|reply
[+] [-] lapcat|3 years ago|reply
[+] [-] TrueGeek|3 years ago|reply
[+] [-] 0cf8612b2e1e|3 years ago|reply
[+] [-] supermatt|3 years ago|reply
[+] [-] tasuki|3 years ago|reply
[+] [-] rapnie|3 years ago|reply
[+] [-] badrabbit|3 years ago|reply
[+] [-] seanhunter|3 years ago|reply
[+] [-] JadoJodo|3 years ago|reply
[+] [-] ffhhj|3 years ago|reply
[+] [-] Dylan16807|3 years ago|reply
[+] [-] travisgriggs|3 years ago|reply
Ghost Banned
or
Ghostdexed
[+] [-] donatj|3 years ago|reply
Scrolling further, I don’t seem to find my own site either… https://donatstudios.com
I’ve added my site into Bing webmaster tools, we’ll see if it helps I guess.
[+] [-] HomeDeLaPot|3 years ago|reply
[+] [-] mtlynch|3 years ago|reply
It's probably not the reason, but it's worth noting that the author is using Amazon affiliate links in violation of Amazon and FTC rules because they're not disclosing the fact that they profit from purchases through their links.
Per Amazon:
>Anytime you share an affiliate link, it's important to disclose that to your audience... you must (1) include a legally compliant disclosure with your links and (2) identify yourself on your Site as an Amazon Associate with the language required by the Operating Agreement.
https://affiliate-program.amazon.com/help/node/topic/GHQNZAU...
Per FTC:
>As for where to place a disclosure, the guiding principle is that it has to be clear and conspicuous... Consumers should be able to notice the disclosure easily. They shouldn’t have to hunt for it.
https://www.ftc.gov/business-guidance/resources/ftcs-endorse...
[+] [-] mananaysiempre|3 years ago|reply
[1] https://beej.us/guide/bgc/whynoddg.html
[+] [-] Pelam|3 years ago|reply
[+] [-] pseudolus|3 years ago|reply
[0] https://www.bing.com/webmasters/help/url-submission-62f2860b
[+] [-] Liquix|3 years ago|reply
[+] [-] beej71|3 years ago|reply
[+] [-] lapcat|3 years ago|reply
[EDIT] I just published a new blog post "Bing and DuckDuckGo removed my business web site AGAIN" https://lapcatsoftware.com/articles/bing2.html
Sigh.
[+] [-] mg|3 years ago|reply
https://daverupert.com/atom.xml
First, he sends it with a "content-type: application/xml" header. In contrast to most sites that send it with "content-type: application/atom+xml". Which seems to have the nice effect that it renders in Firefox instead of opening the usual "What should Firefox do with this file?" popup.
Secondly, he provides this nice header text "Yahaha, you found me! This is my RSS feed.". It seems to be fetched via this part of the code:
<?xml-stylesheet href="/pretty-feed-v3.xsl" type="text/xsl"?>
Pretty nice. Are those best practices? Or will "content-type: application/xml" mess with users who have a native feed reader installed and expect the reader to kick in when they click on a feed url?
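For reference, here's a minimal sketch (Python stdlib, purely illustrative — not the setup daverupert.com actually uses) of how a static file server can be told which Content-Type to send for the feed:

```python
from http.server import SimpleHTTPRequestHandler

class FeedHandler(SimpleHTTPRequestHandler):
    # extensions_map is consulted before the mimetypes fallback, so this
    # makes .xml files go out as application/xml, which browsers render
    # inline (applying any xml-stylesheet). Swapping in
    # "application/atom+xml" would instead favor native feed readers,
    # at the cost of a download prompt in browsers without one.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".xml": "application/xml",
    }
```

Passing `FeedHandler` to `http.server.HTTPServer` serves the current directory with that mapping; the same trade-off applies to equivalent `types`/`AddType` settings in nginx or Apache.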
[+] [-] rzzzt|3 years ago|reply
"An elegant weapon for a more... civilized age."
[+] [-] marginalia_nu|3 years ago|reply
https://search.marginalia.nu/site/daverupert.com
[+] [-] sct202|3 years ago|reply
[+] [-] linmob|3 years ago|reply
[+] [-] henriquez|3 years ago|reply
[+] [-] ducklingquack|3 years ago|reply
[+] [-] vcg3rd|3 years ago|reply
It says your IP doesn't direct to your site. I wonder if that's the problem.
[+] [-] tambre|3 years ago|reply
[+] [-] Joel_Mckay|3 years ago|reply
[+] [-] TomK32|3 years ago|reply
[+] [-] omgmajk|3 years ago|reply
[+] [-] dazc|3 years ago|reply
Why, Bing does not totally ignore bad links.
The good news is that all the sites I've seen affected by this recovered after a few weeks.
[+] [-] badrabbit|3 years ago|reply
[+] [-] maxbond|3 years ago|reply
[+] [-] badrabbit|3 years ago|reply
[+] [-] RandomWorker|3 years ago|reply
Maybe change this? Simply add:
User-agent: *
Disallow:
To allow all crawlers to the site.