mattxxx | 9 months ago
1. Criticizes a highly useful technology
2. Matches a potentially-outdated, strict interpretation of copyright law
My opinion: I think using copyrighted data to train models for sure seems classically illegal. Despite that, humans can read a book, get inspiration, and write a new book and not be litigated against. When I look at the litany of derivative fantasy novels, it's obvious they're not all fully independent works.
Since AI is and will continue to be so useful and transformative, I think we just need to acknowledge that our laws did not accommodate this use-case, and then change them.
madeofpalk|9 months ago
Humans get litigated against for this all the time. There is such a thing as, charitably, being too inspired.
https://en.wikipedia.org/wiki/List_of_songs_subject_to_plagi...
jrajav|9 months ago
Plus, all art is derivative in some sense, it's almost always just a matter of degree.
zelphirkalt|9 months ago
nadermx|9 months ago
ashoeafoot|9 months ago
hochstenbach|9 months ago
Humans are also very useful and transformative.
timdiggerm|9 months ago
ceejayoz|9 months ago
You're still not gonna be allowed to commercially publish "Hairy Plotter and the Philosophizer's Rock".
WesolyKubeczek|9 months ago
anigbrowl|9 months ago
ActionHank|9 months ago
The hold US companies have on the world will be dead too.
I also suspect that media piracy will be labelled as the only reason we need copyright, an existing agency will be bolstered to address this concern and then twisted into a censorship bureau.
regularjack|9 months ago
dns_snek|9 months ago
Go back to the roots of copyright and the answers should be obvious. According to the US constitution, copyright exists "To promote the Progress of Science and useful Arts" and according to the EU, "Copyright ensures that authors, composers, artists, film makers and other creators receive recognition, payment and protection for their works. It rewards creativity and stimulates investment in the creative sector."
If I publish a book and tech companies are allowed to copy it, use it for "training", and later regurgitate the knowledge contained within to their customers then those people have no reason to buy my book. It is a market substitute even though it might not be considered such under our current copyright law. If that is allowed to happen then investment will stop and these books simply won't get written anymore.
p0w3n3d|9 months ago
As a private person I no longer feel incentivised to create new content online because I think that all I create will eventually be stolen from me...
franczesko|9 months ago
Processing of IP without a license AND offering it as a model for money doesn't seem like an unknown use-case to me.
SilasX|9 months ago
Huh? If you agree that "learning from copyrighted works to make new ones" has traditionally not been considered infringement, then can you elaborate on why you think it fundamentally changes when you do it with bots? That would, if anything, seem to be a reversal of classic copyright jurisprudence. Up until 2022, pretty much everyone agreed that "learning from copyrighted works to make new ones" is exactly how it's supposed to work, and would be horrified at the idea of having to separately license that.
Sure, some fundamental dynamic might change when you do it with bots, but you need to make that case in an enforceable, operationalized way.
unknown|9 months ago
[deleted]
bitfilped|9 months ago
otabdeveloper4|9 months ago
[deleted]
palmotea|9 months ago
[deleted]
ulbu|9 months ago
abstracting llms from their operators and owners and possible (and probable) ends and the territories they trample upon is nothing short of eye-popping to me. how utterly negligent and disrespectful of fellow people must one be at the heart to give any credence to such arguments
jobigoud|9 months ago
Copyright only comes into play on publication. It's only concerned with publication of the models and publication of works. The machine itself doesn't have agency to publish anything at this point.
Suppafly|9 months ago
I don't see how that affects the argument. The machines are being used by humans. Your argument then boils down to the idea that you can do something manually but it becomes illegal if you use a tool to do it efficiently.
gruez|9 months ago
That might be true but I don't see how it's relevant. There's no provision in copyright law that gives a free pass to humans vs machines, or makes a distinction between them.
Intralexical|9 months ago
The direction we're going, it seems more likely it'll be recycling to murder a human.
jeroenhd|9 months ago
That doesn't make piracy legal, even though I get a lot of use out of it.
Also, a person isn't a computer so the "but I can read a book and get inspired" argument is complete nonsense.
Workaccount2|9 months ago
What we do know though is that LLMs, similar to humans, do not directly copy information into their "storage". LLMs, like humans, are pretty lossy with their recall.
Compare this to something like a search indexed database, where the recall of information given to it is perfect.
datavirtue|9 months ago
apercu|9 months ago
Corporations are not humans. (It's ridiculous that they have some legal protections in the US like humans, but that's a different issue). AI is also not human. AI is also not a chipmunk.
Why the comparison?
stevenAthompson|9 months ago
AI is fine as long as the work it generates is substantially new and transformative. If it breaks and starts spitting out other people's work verbatim (or nearly verbatim) there is a problem.
Yes, I'm aware that machines aren't people and can't be "inspired", but if the functional results are the same the law should be the same. Vaguely defined ideas like your soul or "inspiration" aren't real. The output is real, measurable, and quantifiable and that's how it should be judged.
mjburgess|9 months ago
toast0|9 months ago
I believe cover song licensing is available mechanically; you don't need permission, you just need to follow the procedures including sending the licensing fees to a rights clearing house. Music has a lot of mechanical licenses and clearing houses, as opposed to other categories of works.
datavirtue|9 months ago
Why is that? Seems all logic gets thrown out the window when invoking AI around here. References are given. If the user publishes the output without attribution, NOW you have a problem. People are being so rabid and unreasonable here. Totally bat shit.
vessenes|9 months ago
I understand people who create IP of any sort being upset that software might be able to recreate their IP or stuff adjacent to it without permission. It could be upsetting. But I don't understand how people jump to "Copyright Violation" for the fact of reading. Or even downloading in bulk. Copyright controls, and has always controlled, creation and distribution of a work. The very nature of the copyright notice embeds the concept that the work will be read.
Reading and summarizing have only ever been controlled in western countries via state-secrets-type acts or, alternatively, non-disclosure agreements between parties. It's just way, way past reality to claim that we have existing laws to cover AI training ingesting information. Not only do we not, such rules would seem insane if you substitute the word "human" for "AI" in most of these conversations.
"People should not be allowed to read the book I distributed online if I don't want them to."
"People should not be allowed to write Harry Potter fanfic in my writing style."
"People should not be allowed to get formal art training that involves going to museums and painting copies of famous paintings."
We just will not get to a sensible societal place if the dialogue around these issues has such a low bar for understanding the mechanics and the societal tradeoffs we've made so far, and isn't able to discuss where we might want to go and what would be best.
datavirtue|9 months ago
caconym_|9 months ago
Of course, if you start your thought by dismissing anybody who doesn't share your position as not sane, it's easy to see how you could fail to capture any of that.
^[1] https://arstechnica.com/tech-policy/2025/05/judge-on-metas-a...
jasonlotito|9 months ago
The article specifically talks about the creation and distribution of a work. Creation and distribution of a work alone is not a copyright violation. However, if you take in input from something you don't own, and genAI outputs something, it could be considered a copyright violation.
Let's make this clear; genAI is not a copyright issue by itself. However, gen AI becomes an issue when you are using as your source stuff you don't have the copyright or license to. So context here is important. If you see people jumping to copyright violation, it's not out of reading alone.
> "People should not be allowed to read the book I distributed online if I don't want them to."
This is already done. It's been done for decades. See any case where content is locked behind an account. Only select people can view the content. The license to use the site limits who or what can use things.
So it's odd you would use "insane" to describe this.
> "People should not be allowed to write Harry Potter fanfic in my writing style."
Yeah, fan fiction is generally not legal. However, there are some cases where fair use covers it. Most cases of fan fiction are allowed because the author allows it. But no, generally, fan fiction is illegal. This is well known in the fan fiction community. Obviously, if you don't distribute it, that's fine. But we aren't talking about non-distribution cases here.
> "People should not be allowed to get formal art training that involves going to museums and painting copies of famous paintings."
Same with fan fiction. If you replicate a copyrighted piece of art and distribute it, that's illegal. If you simply do it for practice, that's fine. But no, if you go around replicating a painting and distribute it, that's illegal.
Of course, technically speaking, none of this is what gen AI models are doing.
> We just will not get to a sensible societal place if the dialogue around these issues has such a low bar for understanding the mechanics
I agree. Personifying gen AI is useless. We should stick to the technical aspects of what it's doing, rather than trying to pretend it's doing human things when it's 100% not doing that in any capacity. I mean, that's fine for the layman, but anyone with any ounce of technical skill knows that's not true.