(no title)
rowanG077|6 days ago
> By asking models to complete sentences from a book, Gemini 2.5 regurgitated 76.8 percent of Harry Potter and the Philosopher’s Stone with high levels of accuracy, while Grok 3 generated 70.3 percent.
So you asked the LLM, given an incomplete sentence, to complete it, and it only completed that sentence the same way as the book ~70 percent of the time? I think that is surprisingly low considering this is a perfect fit for what LLMs are supposed to do. This makes it impossible to reproduce the book unless you already have access to it. And even then you get a very low fidelity copy.
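For context, a minimal sketch of what such a completion probe might look like, assuming sentence-level prefix/continuation splits; `generate` is a hypothetical stand-in for whichever model API the researchers actually used:

```python
from typing import Callable

def completion_match_rate(
    sentences: list[str],
    generate: Callable[[str], str],  # hypothetical LLM call: prefix -> continuation
    prefix_words: int = 8,
) -> float:
    """Fraction of sentences whose model completion reproduces the book text."""
    attempted = matched = 0
    for sentence in sentences:
        words = sentence.split()
        if len(words) <= prefix_words:
            continue  # too short to split into a prefix and a continuation
        attempted += 1
        prefix = " ".join(words[:prefix_words])
        expected = " ".join(words[prefix_words:])
        # Count a match only if the model's continuation starts with
        # the book's actual continuation of that sentence.
        if generate(prefix).strip().startswith(expected):
            matched += 1
    return matched / attempted if attempted else 0.0
```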
porkloin|6 days ago
I didn't read the source paper referenced in the Ars Technica piece, but this statement about it makes me wonder how useful it actually is:
> But a study published last month showed that researchers at Stanford and Yale Universities were able to strategically prompt LLMs from OpenAI, Google, Anthropic, and xAI to generate thousands of words from 13 books, including A Game of Thrones, The Hunger Games, and The Hobbit.
It seems like for well-known books, the tons of summaries, film-script adaptations, and other writing about the book in the overall corpus make it way less surprising to see them partially reproduced.
So I guess that's a lot of words to say: yeah, until there's something definitive that allows people to prompt LLMs into either unlawfully recreating an entire work verbatim, or that otherwise indisputably proves a copyrighted work was used in training data, there's probably nothing game-changing in it.
vidarh|6 days ago
I suspect very few works will be memorised enough to be an issue, and we'll see the providers tighten up their guardrails a bit for works that are well known enough to actually be a potential issue (an issue in the form of lawsuits, not in the form of real damage to the copyright holders).
in-silico|6 days ago
If they end a single sentence differently than the original, then the next sentence will be different and so on until you get a very different novel. Sure they could course-correct back towards the original plot, but it's going to be a challenge to stay on target when every third sentence is incorrect.
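A rough back-of-the-envelope on how fast that compounds, assuming (unrealistically) independent sentence completions at the ~70 percent rate reported in the article:

```python
# How per-sentence accuracy compounds into whole-passage fidelity.
# Assumes each sentence independently matches the book with probability p.
p = 0.7  # per-sentence match rate from the article

for n in (5, 10, 50, 100):
    print(f"P(first {n} sentences all match) = {p**n:.2e}")
# At p = 0.7, a run of 50 faithful sentences is ~1.8e-08:
# without the original text to correct against, drift dominates quickly.
```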
vidarh|6 days ago
EDIT: Specifically, see Table 1 on page 13, which shows the longest "near-verbatim block" per model: it maxes out at 8835 (The Hobbit on Claude 3.7) and is in the thousands for at least one of the novels for every model except GPT-4.1, which maxed out at 821 (Harry Potter 1).
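The paper's exact matching criterion isn't quoted here, but as a crude proxy, Python's standard-library difflib can find the longest exactly-shared block between a generation and the source text:

```python
from difflib import SequenceMatcher

def longest_verbatim_block(generated: str, original: str) -> str:
    """Longest contiguous block shared verbatim by the two texts.

    A crude proxy for the paper's "near-verbatim block" metric; the
    paper's actual criterion presumably tolerates small edits.
    """
    # autojunk=False: the default junk heuristic degrades character-level
    # matching on long texts by discarding frequent characters.
    m = SequenceMatcher(None, generated, original, autojunk=False)
    match = m.find_longest_match(0, len(generated), 0, len(original))
    return generated[match.a:match.a + match.size]
```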
Sharlin|6 days ago