alangpierce | 2 years ago
This article proposes one mitigation for some indirect prompt injection attacks: "sandbox" untrusted content in a second LLM that is never given the ability to decide which actions to take: https://simonwillison.net/2023/Apr/25/dual-llm-pattern/
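A minimal sketch of that dual-LLM idea, assuming two hypothetical model callables (`privileged_llm`, `quarantined_llm`) that stand in for whatever API you actually use; the point is the data flow, not the specific calls. The privileged model only ever sees opaque variable names, the quarantined model only ever sees untrusted text, and substitution happens in plain controller code outside either model:

```python
from typing import Callable, Dict, List


class DualLLMController:
    """Sketch of the Dual LLM pattern: keep untrusted text away from
    the LLM that plans actions (names and structure are assumptions)."""

    def __init__(self,
                 privileged_llm: Callable[[str], str],
                 quarantined_llm: Callable[[str], str]) -> None:
        self.privileged_llm = privileged_llm    # trusted input only, can trigger actions
        self.quarantined_llm = quarantined_llm  # untrusted input, no tools, no decisions
        self.variables: Dict[str, str] = {}

    def quarantine(self, untrusted_text: str, task: str) -> str:
        """Run the quarantined LLM on untrusted content and return only
        an opaque variable name, never the content itself."""
        result = self.quarantined_llm(f"{task}\n\n{untrusted_text}")
        name = f"$VAR{len(self.variables) + 1}"
        self.variables[name] = result
        return name

    def plan(self, user_request: str, variable_names: List[str]) -> str:
        """Ask the privileged LLM what to do. It sees the trusted user
        request and variable names, but not the variables' contents."""
        prompt = (
            f"User request: {user_request}\n"
            f"Available variables (contents hidden): {', '.join(variable_names)}\n"
            "Respond with the action to take, referencing variables by name."
        )
        return self.privileged_llm(prompt)

    def expand(self, action: str) -> str:
        """Substitute variable contents at execution time, in ordinary
        code, so injected instructions never reach the planner."""
        for name, value in self.variables.items():
            action = action.replace(name, value)
        return action
```

For example, the controller could quarantine an incoming email ("summarize this"), get back "$VAR1", let the privileged LLM decide "draft a reply using $VAR1", and only expand $VAR1 when the reply is actually composed.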
SkyPuncher | 2 years ago
There are nearly infinite ways to word an attack. You can only protect against the most common of them.
cubefox | 2 years ago
https://news.ycombinator.com/item?id=35929145