apantel | 1 year ago

> Microsoft using copilot to license-launder the github corpus

Isn’t the game with licenses that you must specify what can and cannot be done with the code? It’s a big issue that no one until now had the foresight to forbid ML training on code without attribution. If the licenses going back 30 years had that stipulation, it would be easy to take down Copilot and ChatGPT. But the licenses simply don’t cover the use case of training a neural net, so it’s probably going to slip through the cracks, just as SaaS slipped through the cracks of the GPL by not distributing code, hence the need for the AGPL. So I’m sure we’ll see these kinds of clauses added to licenses going forward, but they can’t be applied retroactively.

The irony in all this is that from the start, open source licensing has been a game of wits where software creators try to cleverly use copyright as a lever of control. Well, they weren’t clever enough. They missed ML training and didn’t forbid it. As a result they’ve basically lost the whole game.

NoraCodes|1 year ago

> Isn’t the game with licenses that you must specify what can and cannot be done with the code?

No, not at all. Microsoft's argument is that training an LLM on code is fair use, and thus doesn't trigger copyright-based licensing at all. That's why they include code with no license in Copilot, which under a training-triggers-copyright theory they would have no right to use at all.

apantel|1 year ago

Thanks for the clarification.