top | item 41756480

(no title)

Related to this in asked LLMs to directly solve the same riddle but then obfuscated the riddle so it wouldn’t match training data and as a final test added extraneous information to distract them.

Outside of o1, simple obfuscation was enough to throw off most of the group.

The distracting information also had a relevant effect. I don’t think LLMs are properly fine tuned for prompters lying to them. With RAG putting “untrusted prose” into the prompt that’s a big issue.

https://hackernoon.com/ai-loves-cake-more-than-truth

discuss

No comments yet.