(no title)
itsnotlupus | 2 years ago
I can't tell where this is going to land. Right now, we're seeing a number of parties trying to put metaphorical barbed wire in the newfound prairie of ML models, each struggling to influence the prevailing wisdom of how IP rights should apply to this context.
We could easily end up in a universe where LLMs are a licensing minefield, where every copyright owner of any part of their training data gets rights on the model, and where models become essentially unmanageable without helpful licensing middlemen who smooth out the right for LLMs to exist, at a cost.
We could just as well end up with LLMs being recognized as not being derivative works themselves, albeit able to generate derivative works, a much less advantageous situation for creatives and their middlemen who see their creative output as being pirated, to reuse a familiar term of IP rights propaganda.
It'd be a little surprising to me if we ended up in a situation where the work needed to produce good LLMs wasn't associated with any commensurate IP rights on the results, and I expect the megacorps investing billions into this will find it in their hearts to throw a few million toward lobbying efforts to ensure that this isn't the outcome.
abtinf | 2 years ago
You can enter into general business contracts that govern how the parties make use of data. This happens all the time with all kinds of data sources: business listings, GIS, credit scores, etc.
If I copy this kind of data, I might be breaking a contract and committing a tort, but I am not violating copyright. If a third party gets the data without a contract in place and makes use of it, they are not violating copyright either; the liability falls on the contracted party that let the data get out.
But licenses of the kind proposed on models are inapplicable. Imagine how bizarre it would be if the phone book came with a license stating “you may only use the information for non-commercial purposes.” The phone book publisher would get laughed out of court and maybe even penalized for frivolous lawsuits.