
spinningslate | 6 months ago

It was never intended to be "enforced":

> The standard, developed in 1994, relies on voluntary compliance [0]

It was conceived in a world with an expectation of collectively respectful behaviour: specifically that search crawlers could swamp "average Joe's" site but shouldn't.

We're in a different world now but companies still have a choice. Some do still respect it... and then there's Meta, OpenAI and such. Communities only work when people are willing to respect community rules, not have compliance imposed on them.
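To make the "voluntary" part concrete, here is a minimal sketch in Python (the site URL and crawler name are hypothetical) of what a polite crawler does. All of the checking happens on the crawler's side, so a bot that never asks faces no obstacle from robots.txt itself:

    import urllib.robotparser

    # A well-behaved crawler fetches robots.txt and asks before crawling.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")   # hypothetical site
    rp.read()

    url = "https://example.com/some/page.html"
    if rp.can_fetch("MyCrawler/1.0", url):
        print("robots.txt permits fetching", url)
    else:
        # A polite crawler stops here; an impolite one simply skips this check.
        print("robots.txt disallows", url)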

It then becomes an arms race: a reasonable response from average Joe is "well, OK, I'll allow anyone but [Meta|OpenAI|...] to access my site." Fine in theory, difficult in practice:

1. Block IP addresses for the offending bots --> bots run from obfuscated addresses

2. Block the bot user agent --> bots lie about UA (a rough sketch follows below).

...and so on.
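For illustration, a rough sketch of option 2 in Python, assuming a hand-maintained block list (the bot names are examples, not an authoritative list). It only works while bots identify themselves honestly; one that spoofs a browser UA sails straight through:

    # Hypothetical block list of AI-crawler user-agent substrings.
    BLOCKED_AGENTS = ("GPTBot", "meta-externalagent")

    def should_block(user_agent: str) -> bool:
        ua = (user_agent or "").lower()
        return any(bot.lower() in ua for bot in BLOCKED_AGENTS)

    print(should_block("Mozilla/5.0 (compatible; GPTBot/1.2)"))  # True: honest bot
    print(should_block("Mozilla/5.0 (Windows NT 10.0; Win64)"))  # False: ordinary or spoofed UA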

[0]: https://en.wikipedia.org/wiki/Robots.txt


majkinetor | 6 months ago

Thanks for the info. However, people seem to think that robots.txt will protect them, while it was created for another world, as you nicely stated. I guess Nepenthes-like tools will become more common in the future, now that the tragedy of the commons has entered the digital domain.