item 39746910 (no title)

legrande|1 year ago
Also needed: ai.txt declaring which AI bots are allowed or disallowed to scrape content

organsnyder|1 year ago
Couldn't this be handled in robots.txt?

omoikane|1 year ago
> Couldn't this be handled in robots.txt?
That would require knowing all the user agents that would scrape content, assuming that you want to only exclude AI scrapers and not search engines in general.
OpenAI's user agent is GPTBot[1], I am not sure about the others.
[1] https://news.ycombinator.com/item?id=37030568

8organicbits|1 year ago
Not well. You'd need to know which user-agent strings the AI scrapers use, which is impossible to enumerate.
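
As a rough illustration of the robots.txt approach discussed in this thread, the sketch below uses Python's standard `urllib.robotparser` to evaluate a made-up robots.txt that blocks GPTBot (the one user agent named above) and allows everything else. The file contents, the `example.com` URL, the `is_allowed` helper, and the `SomeUnlistedAIBot` name are assumptions for illustration only, not anything taken from the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot, allow every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    # SomeUnlistedAIBot is an invented name standing in for any
    # AI scraper whose user agent the site owner does not know about.
    for agent in ("GPTBot", "Googlebot", "SomeUnlistedAIBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Any scraper whose user agent is not listed falls through to the wildcard rule, which is the enumeration problem raised above. robots.txt is also advisory: it only affects crawlers that choose to honor it.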