top | item 36865146

I think that for small amounts of text there's no way around it: the output will be indistinguishable to a machine and to a human alike. There just aren't that many combinations of words that still flow well. Furthermore, as more and more people use it, I think we'll find some humans subconsciously changing their speech patterns to mimic whatever it does. I imagine that with longer text there will be things detectors are able to find, but I think it will end up being trivial for others to work out what those tells are and then modify the result enough to be undetectable.

jerf|2 years ago

I think for this sort of problem it is more productive to think in terms of the amount of text necessary for detection, and how reliable such a detection would be, than in terms of a binary can/can't. I think similarly about how "photorealistic" a particular graphics tech is; many techs have long since passed the point where I can tell the difference at 320x200, but they're not necessarily all there yet at 4K.

LLMs clearly pass the single-sentence test. If you generate far more text than their context window holds, I'm pretty sure they'd clearly fail as they start getting repetitive or losing track of what they've written. In between, it varies depending on how much text you get to look at. A single paragraph is pretty darned hard. With a full essay, I start to become more confident in my assessment.

It's also worth reminding people that LLMs are more than just "ChatGPT in its standard form". As a human who sometimes tries to do bot detection, I've noticed some tells in ChatGPT's "standard voice", which almost everyone is still using, but once people graduate from "Write a blog post about $TOPIC related to $LANGUAGE" to "Write a blog post about $TOPIC related to $LANGUAGE in the style of Ernest Hemingway" in their prompts, it's going to become very difficult to tell by style alone.