NoraCodes | 2 months ago

You - and many other commenters in this thread - misunderstand the legal theory under which AI companies operate. In their view, training their models is allowed under fair use, which means it does not trigger copyright-based licenses at all. You cannot dissuade them with a license.

brookst|2 months ago

While I think OP is shortsighted in their desire for an “open source only for permitted use cases” license, it is entirely possible that training will be found to not be fair use, and/or that making and retaining copies for training purposes is not fair use.

Perhaps you can’t dissuade AI companies today, but it is possible that the courts will do so in the future.

But honestly it’s hard for me to care. I do not think the world would be better if “open source except for militaries” or “open source except for people who eat meat” licenses became commonplace.

gus_massa|2 months ago

The problem is "viral" licences. Must the code generated by an AI trained with GPL code be released with a GPL licence?

Also, can an AI be trained with the leaked source of Windows(R)(C)(TM)?

testdelacc1|2 months ago

Open source except for people who have downvoted any of my comments.

I agree with you though. I get sad when I see people abuse the Commons that everyone contributes to, and I understand that some people want to stop contributing to the Commons when they see that. I just disagree - we benefit more from a flourishing Commons, even if there are freeloaders, even if there are exploiters, etc.

Wowfunhappy|2 months ago

Of course, if the code wasn't available in the first place, the AI wouldn't be able to read it.

It wouldn't qualify as "open source", but I wonder if OP could have some sort of EULA (or maybe it would be considered an NDA). Something to the effect of "by reading this source code, you agree not to use it as training data for any AI system or model."

And then something to make it viral. "You further agree not to allow others to read or redistribute this source code unless they agree to the same terms."

morpheuskafka|2 months ago

My understanding is that you can have such an agreement (basically a kind of NDA) -- but if courts ruled that AI training is fair use, it could never be a copyright violation, only a violation of that contract. Contract violations can only receive economic damages, not the massive statutory damages available for copyright infringement.

archagon|2 months ago

Having a license that specifically disallows a legally dubious behavior could make lawsuits much easier in the future, however. (And might also incentivize lawyers to recommend avoiding this code for LLM training in the first place.)

Workaccount2|2 months ago

People think that code is loaded into a model, like a massive available array of "copy+paste" snippets.

It's understandable that people think this, but it is incorrect.

As an aside, Anthropic's training was ruled fair use, except for the books they pirated.

stefan_|2 months ago

Fair use is a defense to copyright violation, but highly dependent on the circumstances in which it happens. There certainly is no blanket "fair use for AI everything".