wonks | 3 years ago

Wait, are you saying that there is legal precedent for training an LLM with open source code to generate proprietary code?

User23|3 years ago

That depends. Odds are good that some GPL code slipped in somewhere, so licensing the whole thing under the GPL is an option in that case. And sure, you can derive proprietary code from GPL code, so long as you don't distribute the result.

shagie|3 years ago

I would point to the Google v. Oracle Supreme Court decision.

https://www.cnn.com/2021/04/05/tech/google-oracle-supreme-co...

> Writing for the Court, Breyer said that while it is difficult to apply traditional copyright concepts in the context of software programming, Google copied “only what was needed to allow users to put their accrued talents to work in a new and transformative program.”

> A world where Oracle was allowed to enforce a copyright claim, Breyer added, “would risk harm to the public” because it would establish Oracle as a new gatekeeper for software code others wanted to use.

I believe the fair use tests applied in the SCOTUS case would come down on the side of "developers using GPT or Copilot to generate code do not generate substantial parts of the code and are below the amount of work needed to show sufficient creativity in writing it."

The example is https://horstmann.com/unblog/2010-11-15/NodePolicyImpl.html

If that is not a copyright violation and is considered fair use, then the code generated by GPT or Copilot likely falls in the same bucket.

I don't necessarily agree with that, but that's my reading of the tea leaves.