top | item 36038309

(no title)

alangpierce | 2 years ago

This article argues that there's no reliable way to detect prompt injection: https://simonwillison.net/2022/Sep/17/prompt-injection-more-...

One solution to some indirect prompt injection attacks is proposed in this article, where you "sandbox" untrusted content into a second LLM that isn't given the ability to decide which actions to take: https://simonwillison.net/2023/Apr/25/dual-llm-pattern/

discuss

order