top | item 34525620

nickvincent | 3 years ago

This is a great point.

Not a lawyer, but as I understand it, the most likely way this question will be answered (for practical purposes in the US) is via the ongoing lawsuits against GitHub Copilot, Stable Diffusion, and Midjourney.

I personally agree that the creativity is in the source images and the training code. But unless it is decided that, for legal purposes, "AI Artifacts" (the files containing model weights, embeddings, etc.) are just transformations of training data, and therefore content subject to the same legal standards as any other content, I see a lot of value in letting people license training data, code, and models separately. And if models are just transformations of content, I expect we can adjust the norms around licensing to achieve similar outcomes (i.e., balancing open sharing with some degree of creator-defined use restriction).

nl|3 years ago

The Copilot and Stable Diffusion lawsuits aren't about whether the weights file can be copyrighted, though (they are about whether people's work can be freely used for training).

This is a different issue where the OP is arguing that the weights file is not eligible for copyright in the US. That's an interesting and separate point which I haven't really seen addressed before.

topynate|3 years ago

The two issues aren't exactly the same, but they do seem intimately connected. When you consider what's involved in generating a weights file, it's a mostly mechanical process. You write a model, gather some data, and then train. Maybe the design of the model is patentable, or the model/training code is copyrightable (actually, I'm pretty sure it is), but the training process itself is just the execution of a program on some data. You can argue that what that program is doing is simply compiling a collection of facts, which means you haven't created a derivative work. But in that case the weights file is, by definition, a database of facts, and so not copyrightable in the US. Or you can argue that the program is a tool which you're using to create a new copyrightable work. But in that case the result is probably a derivative work of the training data.