De minimis is a longstanding defense in copyright law. If you are copying very little from very many works, as is the case when you turn multiple petabytes into a few gigabytes of neural network weights, you are in the clear. The problem arises when models overfit and spit out almost perfect copies of the training data.
belorn|3 years ago
For example, I could take a massive 8k video and convert it into a very small 144p youtube video. Am I in the clear simply because the output is tiny compared to the input? Similarly, I could take a huge studio master copy of a song and convert it to a very small and rather compressed (distorted) mp3.
I partially agree that some of the problem is when perfect copies are spit out by the models, but I do think there is a bigger problem. Copyright is a complex concept that can't be defined exclusively by a single metric like size, and any mathematical definition will in the end be killed if large copyright holders feel threatened by it.
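For rough scale, the compression ratios in the two analogies can be put side by side. All sizes below are illustrative assumptions, not measurements; the point is only that both conversions are lossy and differ in degree, not in kind:

```python
# Illustrative compression ratios (all sizes are assumed examples).
def ratio(input_bytes: int, output_bytes: int) -> float:
    """Return the input-to-output size ratio."""
    return input_bytes / output_bytes

PB = 1024 ** 5  # petabyte
GB = 1024 ** 3  # gigabyte
MB = 1024 ** 2  # megabyte

# Training corpus -> model weights (e.g. ~2 PB of data -> ~4 GB of weights).
model_ratio = ratio(2 * PB, 4 * GB)

# Studio master -> mp3 (e.g. a ~2 GB master -> a ~5 MB file).
mp3_ratio = ratio(2 * GB, 5 * MB)

print(f"model: {model_ratio:,.0f}:1, mp3: {mp3_ratio:,.0f}:1")
# model: 524,288:1, mp3: 410:1
```

Both are enormously lossy, which is the commenter's point: a size ratio alone cannot separate "learning from" a work from "copying" it.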
sdenton4|3 years ago
"Transformative Use" is a major consideration in fair use copyright: https://en.wikipedia.org/wiki/Transformative_use
ML models do not supplant the pre-existing works and provide fundamentally new modalities. Transformative use seems like a slam dunk to me, but I guess we'll see what the Supremes decide in twenty years or so...
Animats|3 years ago
[1] https://petapixel.com/2023/01/17/getty-images-is-suing-ai-im...
phoe-krk|3 years ago