top | item 46321894

rst | 2 months ago

Anthropic's ahead of you -- the LLM that the reporters were interacting with here had an AI supervisor, "Seymour Cash", which uh... turned out to have some of the same vulnerabilities, though to a lesser extent. Anthropic's own writeup describes the setup: https://www.anthropic.com/research/project-vend-2

UncleMeat | 2 months ago

> Seymour Cash

The "everybody is 12" theory strikes again.