top | item 41387819

tidenly | 1 year ago

A lot of people dislike LLMs and generative AI (fairly) and are reflexively reaching for tools in our existing legal framework, claiming it's obviously already illegal. I don't think this is going to work. Generative AI is quite obviously novel to anyone who isn't in denial, and arguing that existing copyright law already covers it seems like a lost cause.

We need new laws. Especially regarding deepfakes, it's shocking how many people think revenge porn laws and the like are going to be enough here. Rather than focusing only on data usage, we need more fundamental laws and rights, like the right to control representations of ourselves. Japan has something like this: producing images, voice, or video in someone's likeness is directly prosecutable. Likewise, we need laws that explicitly target the use of data for training, separate from copyright.

The way LLMs are trained is obviously too similar to how humans learn, and the transformation and subsequent output produce works that are novel based on that "learning", just as humans do. This is so fundamentally different from what copyright law was made to cover that I find it infuriating how many people handwave these arguments away. Only in perfect one-to-one regurgitation does it even feel close to something copyright could cover.

cowboylowrez | 1 year ago

I'm one of the "dislikers", although the neural network stuff is itself an amazing tool in my opinion. I like to fall back on a much simpler question (IANAL and this is not legal advice): can these code-generating things generate code without having read (trained on) encumbered code?

Humans can learn syntax and basic programs and then, independent of any "similar code", produce new algorithms that solve specific problems. Sure, similar code can be searched for on the internet, but that code is attributed and will likely carry a license. If a human copies it too closely, attribution and licensing rights come into play. The LLMs apparently just bail on attribution.

The way LLMs are trained is that they are fed an absurd amount of code; humans cannot train this way, because the volume of code to be read is simply too great.