top | item 46665898

mikaraento | 1 month ago

That might be somewhat ungenerous unless you have more detail to provide.

I know that at least some LLM products explicitly check output for similarity to training data to prevent direct reproduction.

TZubiri|1 month ago

So it would be able to produce the training data but with sufficient changes or added magic dust to be able to claim it as one's own.

Legally, I think it works, but evidence in court works differently than evidence in science. It's the same word, but don't let that confuse you, and don't mix the two up.

guenthert|1 month ago

Should they though? If the answer to a question^Wprompt happens to be in the training set, wouldn't it be disingenuous to not provide that?

ttctciyf|1 month ago

Maybe it's intended to avoid legal liability resulting from reproducing copyrighted material not licensed for training?