maronato | 7 months ago
Interestingly, it’s the only LLM I’ve seen behave that way. Others simply acknowledge the failure and, after a few hints, eventually get everything working.
Claude just hopes I won’t notice its tricks. It makes me wonder what else it might try to hide when misalignment has more serious consequences.