(no title)
enord
|
2 years ago
Listen, most website and book-authors want to be indexed by google. It brings potential audience their way, so most don’t make use of their _right_ to be de-listed. For these models, there is no plausible benefit to the original creators, and so one has to argue they have _no_ such right to be “de-listed” in order to get any training data currently under copyright.
Ukv|2 years ago
The Authors Guild lawsuit against Google Books ended in a 2015 ruling that Google Books is fair use and as such they don't have a right to be de-listed. It's not the case that they have a right to be de-listed but choose not to make use of it.
The same would apply if collation of data for machine learning datasets is found to be fair use.
> one has to argue they have _no_ such right to be “de-listed” in order to get any training data currently under copyright.
Datasets I'm aware of already have respected machine-readable opt-outs, so if that were to be legally enforced (as it is by the EU's DSM Directive for commercial data mining) I don't think it'd be the end of the world.
There's a lot of power in a default; the set of "everything minus opted-out content" will be significantly bigger than "nothing plus opted-in content" even with the same opinions.
enord|2 years ago
The (quite entertaining) saga of Nightshade tells a story about what is going to be content creators “default position” going forward and everyone else will follow. You would be a fool not to, the AI companies are trying to end run you, using your own content, and make a profit without compensating you and leave you with no recourse.