(no title)
yencabulator | 4 days ago
No, literally no one understands how to solve this. The only option that actually works is to isolate it to a degree that removes the "clawness" from it, and that's the opposite of what people are doing with these things.
Specifically, you cannot guard an LLM with another LLM.
The only thing I've seen with any realism to it is the variables, capabilities and taint tracking in CaMeL, but again that limits what the system can do and requires elaborate configuration. And you can't trust a tainted LLM to configure itself.
https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
https://simonwillison.net/2025/Jun/13/prompt-injection-desig...
hamburglar|4 days ago
yencabulator|4 days ago
Yeah that's an active research topic for teams of PhDs, including some of Google's brightest. And the current approach even with added barriers may just be fundamentally untrustable. Read the links from my earlier comment for background.