htpltr | 2 years ago

Antitrust is one thing, but by cleanroom implementation standards (one team reads the source and writes a spec, another team writes the code) CoPilot is illegal to begin with.

CoPilot reads and rearranges the IP that was created by millions of people who were working very hard and did not anticipate a code laundering machine when they wrote the code and the licenses.

unreal37|2 years ago

That's quite an extreme set of statements, and I very much doubt what you consider "illegal" is actually illegal.

When you publish something for others to view (text, images, code, whatever), others are allowed to view it. You can't anticipate how others view it, with their eyes or with screenreaders to assist. You can't stop them from reading it, thinking about it, discussing it with their friends, taking notes, summarizing it. You can't stop people from learning from your published content or recognizing patterns between it and other similar things.

Sorry, but you can't create a license that says "I will allow you to view this but you cannot learn from it. If you learn from it, you need to pay me."

belorn|2 years ago

Learning is very different from copying. I can take a movie and convert it to different formats and resolutions. I can use AI algorithms to remove rough edges, and even add color to images that were taken in black and white. None of that would be covered by the word learning, even if the program takes the movie as input, learns from it, and outputs a work which is completely different from the original.

The words that seem to fit best are transforming and adapting. In order to adapt something, one first has to learn from the original to produce the derivative work. This is, however, covered by copyright, since transforming and adapting are still considered a form of copying, even if all people did was learn and produce something unique but similar to the original.

The license can say "I will allow you to view this, but you cannot create a derivative work from it".

mrtranscendence|2 years ago

This isn’t about a person learning, however. This is about developing an algorithm through the inclusion of GPL-licensed code, an algorithm that might emit that code verbatim, and has. Those seem materially different to me.

kmeisthax|2 years ago

Clean room is not the actual requirement for avoiding copyright infringement in reverse engineering. There have been several notable cases in which clean room practices were either not followed or outright disregarded, but the resulting product was considered to be non-infringing anyway[0].

Furthermore, while lots of hard work was put into the code that CoPilot used, that hard work was specifically donated with the intent that the code be reused, the only hard requirement being that the code remain free. The thing people are angry about with CoPilot is that it's a hosted OpenAI product with no freely-available model weights, and that generated code might be regurgitated from training data in some cases[1]. If CoPilot were actually open AI, nobody would be suing over it.

[0] In Sony v. Connectix, it was found that Connectix actually tried clean-room, black-box analysis of the PlayStation ROM, but abandoned it in favor of disassembling the whole thing. Connectix was still ruled non-infringing.

[1] Most egregiously, the comment "evil floating point bit level hacking" will make it spit out Quake III source. Microsoft worked around this by explicitly banning that particular phrase, which is just stupid.

williamcotton|2 years ago

Clean room implementations are there to make sure that none of the arbitrary, artistically expressive parts of the code are inadvertently copied.

Class structure, file structure, APIs…

amoss|2 years ago

Clean-room implementation is an approach that guarantees a lack of pollution. It is not the minimum level necessary to avoid it.