drvortex|3 years ago
It is not directly using your code any more than programmers are using print statements. A book can be copyrighted, the vocabulary of language cannot. A particular program can be copyrighted, but snippets of it cannot, especially when they are used in a different context.
And that is why this lawsuit is dead on arrival.
klabb3|3 years ago
This is kinda smug, because it overcomplicates things for no reason, and only serves as a faux technocentric strawman. It just muddies the waters for a sane discussion of the topic, which people can participate in without a CS degree.
The AI models of today are very simple to explain: it's a product built from code (already regulated, produced by the implementors) and source data (usually works that are protected by copyright and produced by other people). It would be a different product if it hadn't used the training data.
The fact that some outputs are similar enough to source data is circumstantial, and not important other than for small snippets. The elephant in the room is the act of using source data to produce the product, and whether the right to decide that lies with the (already copyright protected) creator or not. That's not something to dismiss.
nickelpro|3 years ago
Building a product on top of copyrighted works that does not directly distribute those works is legal. More specifically, a computer consuming a copyrighted work is not a violation of copyright.
unknown|3 years ago
[deleted]
xtracto|3 years ago
Am I violating your copyright? Are you entitled to do that?
To make it funnier: Say instead of the .xz, I "compress" it via π compression [1]. So what I share with you is a pair of π indices and data lengths for each of them, from which you can "reconstruct" the audio. Am I illegally violating your copyrights by sharing that?
[1] https://github.com/philipl/pifs
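To make the "pair of π indices and data lengths" idea concrete, here is a minimal sketch. It is not how pifs itself is implemented; it assumes decimal digits rather than raw bytes, and the helper names (`pi_digits`, `first_digits`, `locate`, `reconstruct`) and the `search_limit` parameter are made up for illustration. The digit generator is Gibbons' unbounded spigot algorithm.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) via Gibbons' spigot."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, t, k, n, l = (
                10 * q, 10 * (r - n * t), t, k,
                (10 * (3 * q + r)) // t - 10 * n, l,
            )
        else:
            q, r, t, k, n, l = (
                q * k, (2 * q + r) * l, t * l, k + 1,
                (q * (7 * k + 2) + r * l) // (t * l), l + 2,
            )


def first_digits(count):
    """Return the first `count` digits of pi as a string."""
    return "".join(str(d) for d in islice(pi_digits(), count))


def locate(payload, search_limit=1000):
    """'Compress' a digit string into an (index, length) pair.

    `search_limit` is a hypothetical cap on how far into pi we search.
    """
    index = first_digits(search_limit).find(payload)
    if index < 0:
        raise ValueError("payload not found within search_limit digits")
    return index, len(payload)


def reconstruct(index, length):
    """'Decompress' by reading `length` digits of pi starting at `index`."""
    return first_digits(index + length)[index:]


index, length = locate("592")
print(index, length, reconstruct(index, length))
```

The pair alone carries no trace of the payload; it only becomes the payload again once it is combined with the digit-generating procedure. In practice the index tends to need about as many digits as the data it points to, so nothing is actually saved.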
Aeolun|3 years ago
It’s also smart enough to rebuild your song from the chords _if you ask it to_.
2muchcoffeeman|3 years ago
obiefernandez|3 years ago
andrewmcwatters|3 years ago
[1]: https://news.ycombinator.com/item?id=33457517
adriand|3 years ago
Aeolun|3 years ago
Now, while you may be able to get it to reproduce one function, reproducing one file, and definitely the whole repository, seems extremely unlikely.
naikrovek|3 years ago
[deleted]
pmarreck|3 years ago
It can also be modified to be opt-in-only (only people's code that they permit to be learned on can use the product).
Cort3z|3 years ago
They would have directly used my code when they trained the thing. I see it as an equivalent of creating a zip-file. My code is not directly in the zip file either. Only by the act of un-zipping does it come back, which requires a sequence of math-steps.
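A rough sketch of the zip analogy, using Python's standard `zlib` module. The `source` bytes and the `pack`/`unpack` helper names are placeholders for illustration, not anyone's real code.

```python
import zlib

# Placeholder "my code": a small function repeated so it compresses well.
source = b"def add(a, b):\n    return a + b\n" * 20


def pack(data):
    """Compress the bytes; the result is a different, shorter byte string."""
    return zlib.compress(data)


def unpack(archive):
    """Run the decompression steps to get the original bytes back."""
    return zlib.decompress(archive)


archive = pack(source)
restored = unpack(archive)

print(len(source), len(archive))  # the archive is much smaller than the source
print(restored == source)         # the original comes back byte for byte
```

The archive is not the source text, but the decompression steps recover it exactly. Whether a trained model is comparable to this lossless round trip is the point under dispute in this thread.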
Filligree|3 years ago
This is a generative neural network. It doesn't contain a copy of your code; it contains weightings that were slightly adjusted by your code. Getting it to output a literal copy is only possible in two cases:
- If your code solves a problem that can only be solved in a single way, for a given coding style / quality level. The AI will usually produce the same result, given the same input, and it's going to be an attempt at a solution. This isn't copyright violation.
- If 'your' code has actually already been replicated hundreds of times over, such that the AI was over-trained on it. In that case it's a copyright violation... but how come you never went after the hundreds of other violations?
heavyset_go|3 years ago
You can easily see this regurgitation of training data happen in an overfitted neural net.
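A toy illustration of the effect, with a loud caveat: this is a character-level n-gram lookup table, not a neural net, and the snippet, `train`, and `generate` are invented for the example. It only shows the general point that a model fitted to very little data with a long context can do nothing except replay that data.

```python
from collections import Counter, defaultdict

# Hypothetical training set: a single snippet, i.e. extreme overfitting.
SNIPPET = "def area(radius):\n    return 3.14159 * radius ** 2\n"
ORDER = 8  # number of preceding characters the model conditions on


def train(text, order):
    """Count which character follows each `order`-character context."""
    table = defaultdict(Counter)
    for i in range(len(text) - order):
        table[text[i:i + order]][text[i + order]] += 1
    return table


def generate(table, prompt, order, max_chars):
    """Greedily extend `prompt` with the most frequent next character."""
    out = prompt
    for _ in range(max_chars):
        options = table.get(out[-order:])
        if not options:
            break  # unseen context: the model has nothing to say
        out += options.most_common(1)[0][0]
    return out


model = train(SNIPPET, ORDER)
completion = generate(model, SNIPPET[:ORDER], ORDER, max_chars=len(SNIPPET))
print(completion == SNIPPET)
```

The table stores only context-to-next-character counts, yet prompting it with the first eight characters walks it straight back through the training snippet.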
CuriouslyC|3 years ago
naikrovek|3 years ago
When you upload code to a public repository on github.com, you necessarily grant GitHub the right to host that code and serve it to other users. The methods used for serving are not specified. This is above and beyond the license you choose for your own code.
You also necessarily grant other GitHub users the right to view this code, if the code is in a public repository.
vkou|3 years ago
So what? Why shouldn't we update the rules of copyright to catch up to advances in technology?
Prior to the invention of the printing press, we didn't have copyright law. Nobody could stop you from taking any book you liked, and paying a scribe to reproduce it, word for word, over and over again. You could then lend, gift, or sell those copies.
The printing press introduced nothing novel to this process! It simply increased the rate at which ink could be put to pages. And yet, in response to its invention, copyright law was created, that banned the most obvious and simple application of this new technology.
I think it's entirely reasonable for copyright law to be updated, to ban the most obvious and simple application of this new technology, both for generating images, and code.
civilized|3 years ago
Completely incorrect. False dichotomy. It's widely known that AI can and does memorize things just like humans do. Memorization isn't a defense to violating copyright, and calling memorization "adjusting a generative model" doesn't make it stop being memorization.
If you memorized Microsoft's code in your brain while working there and exfiltrated it, the fact that it passed through your brain wouldn't be a defense. Substituting "generative model" for "brain" and the fact that it's a tool used by third parties doesn't change this.
moralestapia|3 years ago
https://twitter.com/docsparse/status/1581461734665367554
NicoleJO|3 years ago
lamontcg|3 years ago
Yeah they can, and the whole functions that Copilot spits out are quite obviously covered by copyright.
> especially when they are used in a different context.
That doesn't matter.
ouid|3 years ago
tevon|3 years ago
If I read JRR Tolkien and then go and write a fantasy novel following an unexpected hero on his dangerous quest to undo evil, I haven't infringed, even if I use some of Tolkien's better turns of phrase.
LtWorf|3 years ago
Filligree|3 years ago