alexedw | 3 years ago
A lot of suggestions here talk about the consistent stylistic choices that ChatGPT makes, like its lists or other particular mannerisms. I'd argue these are simply artefacts of it being fine-tuned on a large number of 'well-behaved' examples from OpenAI. This phenomenon is called partial mode collapse; the article below does a great job discussing it with respect to GPT-3 [0].
Of course, you could train a model to detect when this mode collapse occurs and thereby flag ChatGPT output. The un-finetuned base model, however, doesn't have these tells, so it's only a matter of OpenAI improving its fine-tuning dataset to return to an 'undetectable' AI.
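To make the detection idea concrete: one crude, hypothetical signal of partial mode collapse is reduced lexical diversity, since a collapsed model keeps reusing the same words and stock phrasings. A real detector would be a trained classifier over model outputs; this sketch only illustrates the underlying intuition with a simple entropy measure.

```python
from collections import Counter
import math

def token_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the word distribution in `text`.
    Lower entropy = more repetitive, less diverse wording."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy examples (not real model outputs): varied prose vs. the kind of
# repetitive boilerplate a mode-collapsed model tends to produce.
varied = "the quick brown fox jumps over a lazy dog near the old stone bridge"
collapsed = "overall it is important to note that it is important to note that"

print(token_entropy(varied) > token_entropy(collapsed))  # diverse text scores higher
```

In practice a detector would aggregate many such statistical features (or just fine-tune a classifier on labelled human vs. model text), but the point stands: these signals come from the fine-tuning, not the base model.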
[0] https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-...