trebor | 1 year ago
If AI crawlers want access they can either behave, or pay. The consequence will be almost universal blocks otherwise!
mschuster91|1 year ago
Not sure how to implement it in the cloud though, never had the need for that there yet.
[1] https://gist.github.com/flaviovs/103a0dbf62c67ff371ff75fc62f...
jks|1 year ago
Their site is down at the moment, but luckily they haven't stopped Wayback Machine from crawling it: https://web.archive.org/web/20250117030633/https://zadzmo.or...
bwfan123|1 year ago
idlewords|1 year ago
seethenerdz|1 year ago
herpdyderp|1 year ago
How? The difficulty of doing that is the problem, isn't it? (Otherwise we'd just be doing that already.)
ADeerAppeared|1 year ago
Not quite what the original commenter meant but: WE ARE.
A major consequence of this reckless AI scraping is that it turbocharged the move away from the web and into closed ecosystems like Discord. Away from the prying eyes of most AI scrapers ... and the search engine indexes that made the internet so useful as an information resource.
Lots of old websites & forums are going offline as their hosts either cannot cope with the load or send a sizeable bill to the webmaster who then pulls the plug.
gundmc|1 year ago
unsnap_biceps|1 year ago
That counts as barely imho.
I found this out after OpenAI was decimating my site and ignoring the wildcard deny-all. I had to add entries specifically for their three bots to get them to stop.
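For anyone wanting to sanity-check their own file, here is a minimal sketch using Python's standard `urllib.robotparser`. The three user-agent tokens (`GPTBot`, `ChatGPT-User`, `OAI-SearchBot`) are an assumption on my part about which bots are meant, and `blocked_agents` is just a hypothetical helper name. It only shows what a compliant parser concludes from the file; it says nothing about whether a given crawler actually honors it.

```python
from urllib.robotparser import RobotFileParser

# Wildcard deny-all plus explicit per-bot entries.
# The three bot names are assumed, not taken from the comment above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /
"""


def blocked_agents(robots_txt, agents, path="/"):
    """Map each user agent to True if a compliant parser would block it on `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, path) for agent in agents}


result = blocked_agents(ROBOTS_TXT, ["GPTBot", "ChatGPT-User", "OAI-SearchBot"])
for agent, blocked in result.items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

To a compliant parser the explicit entries are redundant with the wildcard, which is the whole complaint: they should not have been necessary.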
LukeShu|1 year ago
And how often does it check robots.txt? ClaudeBot will make hundreds of thousands of requests before it re-checks robots.txt to see that you asked it to please stop DDoSing you.
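For contrast, a rough sketch of what periodic re-checking could look like on the crawler side. This is not how ClaudeBot or any real crawler is implemented; `RobotsCache`, `fetch_text`, and the one-hour default are hypothetical choices for illustration. The idea is simply that the cached rules expire after `max_age` seconds, so a newly added Disallow takes effect within a bounded time rather than after hundreds of thousands of requests.

```python
import time
from urllib.robotparser import RobotFileParser


class RobotsCache:
    """Re-reads robots.txt once the cached copy is older than max_age seconds."""

    def __init__(self, fetch_text, max_age=3600.0, clock=time.monotonic):
        # fetch_text: hypothetical callable returning the robots.txt body as a string
        self._fetch_text = fetch_text
        self._max_age = max_age
        self._clock = clock
        self._parser = None
        self._fetched_at = None

    def allowed(self, agent, path):
        """Return True if `agent` may fetch `path` under the current (fresh) rules."""
        now = self._clock()
        if self._parser is None or now - self._fetched_at >= self._max_age:
            # Cache is missing or stale: fetch and parse robots.txt again.
            parser = RobotFileParser()
            parser.parse(self._fetch_text().splitlines())
            self._parser, self._fetched_at = parser, now
        return self._parser.can_fetch(agent, path)
```

A crawler would call `allowed()` before every request; the refetch cost is one extra request per `max_age` window per site.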
Animats|1 year ago
ksec|1 year ago
Or they could at least have the courtesy to scrape during nighttime / off-peak hours.
jsheard|1 year ago
tredre3|1 year ago
seethenerdz|1 year ago
Vampiero|1 year ago
Who cares? They've already scraped the content by then.
jsheard|1 year ago
_heimdall|1 year ago
seethenerdz|1 year ago
emmelaich|1 year ago
seethenerdz|1 year ago