mvcalder | 2 years ago

Does anyone know what the copyright status of LLM-generated content is? That is, if I feed a NYT article into GPT4, ask it to summarize the article, and then publish that summary, is there any argument or precedent establishing whether that is copyright infringement? Asking for a friend.

zozbot234|2 years ago

Typically if you ask a chatbot to "summarize" something, it will paraphrase the original closely enough that it would be considered plagiarism and copyright infringement. To avoid that, it's required to distill the relevant ideas contained in the text, and expound on them in a way that's not dependent on how the text itself was expressed, structured or organized. You would need to tell the model to do this over multiple steps, and then derive a rephrased article without looking at the original at all. (Which is not really possible if the article was in the AI's training set, as is the case here.)
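A minimal sketch of the multi-step approach described above, written in Python. Everything here is an assumption for illustration: `complete` stands in for whatever function sends a prompt to a model and returns its reply, and the prompt wording and function names are hypothetical. It only shows the structural point that the second step is handed the distilled ideas and never the original text; it does not address the training-set caveat, and it says nothing about how a court would treat the result.

```python
from typing import Callable

# Hypothetical: any function that takes a prompt and returns the model's reply.
Complete = Callable[[str], str]


def extract_ideas(article: str, complete: Complete) -> list[str]:
    """Step 1: the only step that is shown the original article."""
    prompt = (
        "List the key facts and ideas in the text below as short, "
        "standalone bullet points in your own words. Do not quote it.\n\n"
        + article
    )
    ideas = []
    for line in complete(prompt).splitlines():
        # Drop leading bullet characters and surrounding whitespace.
        cleaned = line.lstrip("-*• \t").strip()
        if cleaned:
            ideas.append(cleaned)
    return ideas


def write_from_ideas(ideas: list[str], complete: Complete) -> str:
    """Step 2: receives only the idea list, never the article text."""
    bullets = "\n".join(f"- {idea}" for idea in ideas)
    return complete("Write a short article covering these points:\n" + bullets)


def rephrase(article: str, complete: Complete) -> str:
    """Run both steps; the article text is passed to step 1 only."""
    return write_from_ideas(extract_ideas(article, complete), complete)
```

Calling `rephrase(article_text, complete)` with a real model wrapper would run both steps in sequence; keeping them as separate functions makes it easy to inspect the intermediate idea list before anything is rewritten.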

mensetmanusman|2 years ago

No one knows, this is new territory.

Maybe the Fermi filter is litigating an AI that would otherwise save humanity.

kevindamm|2 years ago

Or the filter could run the other way: failing to litigate an AI to slow its progress, and so risking augmenting the underlying society too much, too quickly.

gizmo385|2 years ago

> an AI that would otherwise save humanity.

Just to clarify, this is sarcasm, right?

paulddraper|2 years ago

There is no difference between an LLM summarizing a copyrighted work and a Wikipedia contributor summarizing a copyrighted work.

Wikipedia has some words on how summaries relate to copyright law: https://en.wikipedia.org/wiki/Wikipedia:Plot-only_descriptio...

unknown|2 years ago

[deleted]

lb4r|2 years ago

Technically, you just send a request to OpenAI, and they are the ones who feed it into GPT4. Although I'd argue this is irrelevant to your question, the law works in mysterious ways, so perhaps it carries some importance.