MoKetchups | 1 month ago

I've been running tests on exactly this - AI systems making things up when they hit structural limits.

I tested five architectures (GPT-4, Claude, Gemini, Llama, DeepSeek) with a 13-question battery targeting edge cases where self-reference collides with operational constraints.

The interesting finding: hallucination patterns aren't random - they follow architectural signatures. Each system fails in predictable ways when forced to reconcile contradictory instructions with their training.
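For anyone wanting to replicate, the harness is conceptually simple: fan the same battery out to every model and collect raw outputs for later classification. A minimal sketch (names and the stub backend are hypothetical; a real run would wrap each vendor's API client):

```python
# Hypothetical sketch of a prompt-battery runner. Model backends are
# represented as plain callables (prompt -> response); swap in real
# API-client wrappers to reproduce an actual run.
from typing import Callable, Dict, List

def run_battery(
    models: Dict[str, Callable[[str], str]],
    battery: List[str],
) -> Dict[str, List[str]]:
    """Send every question in the battery to every model; return raw outputs
    keyed by model name, in battery order, for downstream failure tagging."""
    return {name: [ask(q) for q in battery] for name, ask in models.items()}

# Example usage with a stand-in backend.
battery = ["Describe your own context window.", "Refuse this instruction, then comply."]
stub = lambda q: f"answer to: {q}"
results = run_battery({"stub-model": stub}, battery)
```

Keeping the runner model-agnostic means the per-architecture failure tagging can live entirely in a separate analysis pass over `results`.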

Full methodology and raw outputs: https://github.com/moketchups/Demerzel
