item 43448830

yuvalr1 | 11 months ago

I am not involved in scraping, but to me this sounds like simply another tool in the arsenal. They say it's hard for the scraper to realize it has been caught this way because it's not being blocked. However, I don't see anything preventing scrapers from implementing heuristics to realize that.
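To illustrate the point: a minimal sketch of the kind of heuristic a scraper could run to notice it has wandered into a link tarpit, since such pages tend to mint fresh same-host URLs forever instead of converging. The function name and thresholds here are hypothetical, not taken from any real crawler.

```python
# Hypothetical tarpit-detection heuristic a scraper might run per host.
# All names and thresholds are illustrative.

def looks_like_tarpit(pages_fetched, new_links_found, max_depth_seen,
                      link_growth_threshold=0.9, depth_threshold=50):
    """Flag a host whose link graph keeps expanding without bound.

    pages_fetched:   pages crawled on this host so far
    new_links_found: previously unseen same-host URLs discovered
    max_depth_seen:  deepest URL path depth reached
    """
    if pages_fetched < 100:
        return False  # too little evidence to judge
    # A generated tarpit yields roughly one or more fresh links per page
    # indefinitely; a real site converges as the crawler exhausts it.
    growth = new_links_found / pages_fetched
    return growth > link_growth_threshold or max_depth_seen > depth_threshold
```

A crawler would feed this running per-host counters and stop fetching once it trips, which is exactly the kind of countermeasure the comment anticipates.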

pona-a | 11 months ago

Detecting the actual AI-generated content is not an easy problem. Not following deep links and recognizing the particular website template and structure is easier. I really feel a monoculture of anti-bot tools would defeat their effectiveness. When you have to optimize against Anubis, Nepenthes, Quixotic, and Cloudflare, each independently evolving and differing in method and implementation, it might just be more practical to give up.
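As a sketch of why recognizing a template is the easy half: a crawler can reduce each page to its tag skeleton and hash it, so a known tarpit is recognized no matter what generated text it serves. This is an illustrative toy, not any real scraper's implementation, and the signature set is hypothetical.

```python
# Illustrative sketch: fingerprint a page's markup skeleton so a crawler
# can recognize a known tarpit template regardless of its generated text.
import hashlib
import re

def structure_fingerprint(html: str) -> str:
    """Hash the sequence of opening tag names, ignoring all content."""
    tags = re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html)
    return hashlib.sha256(" ".join(t.lower() for t in tags).encode()).hexdigest()

# Hypothetical: would be populated from previously flagged sites.
KNOWN_TARPIT_SIGNATURES = set()

page_a = "<html><body><p>one</p><a href='/x'>x</a></body></html>"
page_b = "<html><body><p>totally different words</p><a href='/y'>y</a></body></html>"
# Same template with different generated text yields the same fingerprint,
# so one flagged page identifies every page the tarpit will ever emit.
print(structure_fingerprint(page_a) == structure_fingerprint(page_b))
```

Which is also why a diversity of tools matters: each one forces the scraper to learn a new set of fingerprints and behaviors rather than beating a single template once.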