LoganDark|2 days ago
One of the things that a lot of LLM scrapers are fetching is git repositories. They could just use git clone to fetch everything at once, but instead they fetch them commit by commit. That content is about as static as it gets, and it is absolutely NOT a non-issue.
KolmogorovComp|2 days ago
This is wrong. Git does store full copies.
neoromantique|2 days ago
Statically prebuild the most common commits (the last XX) and heavily rate-limit requests for deeper ones
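A minimal sketch of that strategy in Python, assuming a hypothetical `handle_request` front end (the function name, commit IDs, and thresholds are all made up for illustration): recent commits come from a prebuilt static cache, while deep-history requests go through a per-client sliding-window rate limiter.

```python
import time
from collections import deque

# Hypothetical setup: assume the last 100 commit pages are prebuilt.
RECENT_COMMITS = {f"commit-{i}" for i in range(100)}
MAX_DEEP_REQUESTS = 5      # deep-history requests allowed per window
WINDOW_SECONDS = 60.0

_request_log = {}          # client -> deque of recent deep-request timestamps

def handle_request(client, commit):
    """Return 'static' for prebuilt commits, 'render' for a deep
    request within the rate limit, or 'throttle' otherwise."""
    if commit in RECENT_COMMITS:
        return "static"            # cheap: served from the prebuilt cache
    now = time.monotonic()
    log = _request_log.setdefault(client, deque())
    while log and now - log[0] > WINDOW_SECONDS:
        log.popleft()              # drop timestamps outside the window
    if len(log) >= MAX_DEEP_REQUESTS:
        return "throttle"          # deep history is heavily rate limited
    log.append(now)
    return "render"                # render the old commit page on demand
```

Keyed per client this caps how fast any one IP can walk the deep history, though (as the reply below notes) a scraper spread across many IPs defeats a purely per-client limit.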
PaulDavisThe1st|1 day ago
2. 1M independent IPs hitting random commits from across a 25-year history is not, in fact, "easy to solve". It is addressable, but not easy ...
3. Why should I have to do anything at all to deal with these scrapers? Why is the onus not on them to do the right thing?