top | item 46942844

vohk | 21 days ago

I can't offer an example of code, but considering researchers were able to cause models to reproduce literary works verbatim, it seems unlikely that a git repository would be materially different.

https://www.theatlantic.com/technology/2026/01/ai-memorizati...

jmt710|18 days ago

These arguments absolutely infuriate me. Your code is not that unique. Lots of people write the same snippet every day and have no idea that somebody else just wrote the same thing.

It's such a crock that you can somehow claim you're the only person who can write that snippet and now everyone else owes you something. No. No they don't. Get over it.

Writing a book is different. Lifting pages or chapters is different because it's much harder for two people to write the exact same thing. Code is code: it follows a formula, and everyone uses that formula.

20k|17 days ago

Writing an exact copy of a nontrivial function by mistake is so rare that I've never seen it happen in 20 years of programming.

thedevilslawyer|21 days ago

Assuming that even works from a researcher's perspective, it's working back from a specific goal. There's 0 actual instances (and I've been looking) where verbatim code has been spat out.

It's a convenient criticism of LLMs, but a wrong one. We need to do better.

latexr|20 days ago

> There's 0 actual instances (and I've been looking) where verbatim code has been spat out.

That’s not true. I’ve seen it happen and remember reports where it was obvious it happened (and trivial to verify) because the LLM reproduced the comments with source information.

Either way, plagiarism doesn’t require one to copy 100% verbatim (otherwise every plagiarist would easily be off the hook). It still counts as plagiarism if you move a space or rename a variable.
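To make the "move a space or rename a variable" point concrete, here is a minimal sketch of how such a copy could still be recognized. It is only an illustration under stated assumptions, not a real plagiarism detector: it handles Python source only, and the helper names `normalized_tokens` and `same_modulo_renaming` are hypothetical, made up for this example. It uses the standard-library `tokenize` module to drop comments and spacing and to replace every identifier with a positional placeholder before comparing.

```python
import io
import keyword
import tokenize

# Token types that carry no program logic for this comparison.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER}


def normalized_tokens(source: str) -> list[str]:
    """Hypothetical helper: reduce Python source to a token list that
    ignores spacing, comments, and the specific identifier names used."""
    placeholders: dict[str, str] = {}
    out: list[str] = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.INDENT:
            out.append("<INDENT>")   # keep block structure, drop indent width
        elif tok.type == tokenize.DEDENT:
            out.append("<DEDENT>")
        elif tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # First distinct identifier becomes ID0, the next ID1, and so on.
            out.append(placeholders.setdefault(tok.string, f"ID{len(placeholders)}"))
        else:
            out.append(tok.string)
    return out


def same_modulo_renaming(a: str, b: str) -> bool:
    """True if two snippets differ only in spacing, comments, or names."""
    return normalized_tokens(a) == normalized_tokens(b)


original = "def add(a, b):\n    return a + b\n"
renamed = "def  sum_two(x,y):\n        return x+y  # adds two numbers\n"
different = "def add(a, b):\n    return a - b\n"

print(same_modulo_renaming(original, renamed))
print(same_modulo_renaming(original, different))
```

Under these assumptions the renamed, re-spaced copy compares equal to the original, while the snippet with different logic does not. A match on something this small proves nothing by itself; the check only becomes meaningful on longer, nontrivial functions.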

https://xcancel.com/DocSparse/status/1581461734665367554

https://xcancel.com/mitsuhiko/status/1410886329924194309

> We need to do better.

I agree. We have to start by not dismissing valid criticisms by appealing to irrelevant technicalities which don’t excuse anything.

thechao|20 days ago

I don't have code examples, but this tracks for me. Any time I have an agent write something "obvious" and crazy hard -- say a new compiler for a new language? Golden. I ask it to write a fairly simple stack invariant version of an old algorithm using a novel representation (topology) using a novel construction (free module) ... zip. It's 200 LOC, and after 20+ attempts, I've given up.