alangpierce | 2 years ago
This article proposes one mitigation for some indirect prompt injection attacks: "sandbox" untrusted content in a second LLM that is never given the ability to decide which actions to take: https://simonwillison.net/2023/Apr/25/dual-llm-pattern/
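A minimal sketch of that dual-LLM idea, assuming two hypothetical model callables (`privileged_llm`, `quarantined_llm`) that stand in for whatever API you actually use; the point is the data flow, not the specific calls. The privileged model only ever sees opaque variable names, the quarantined model only ever sees untrusted text, and substitution happens in plain controller code outside either model:

```python
from typing import Callable, Dict, List


class DualLLMController:
    """Sketch of the Dual LLM pattern: keep untrusted text away from
    the LLM that plans actions (names and structure are assumptions)."""

    def __init__(self,
                 privileged_llm: Callable[[str], str],
                 quarantined_llm: Callable[[str], str]) -> None:
        self.privileged_llm = privileged_llm    # trusted input only, can trigger actions
        self.quarantined_llm = quarantined_llm  # untrusted input, no tools, no decisions
        self.variables: Dict[str, str] = {}

    def quarantine(self, untrusted_text: str, task: str) -> str:
        """Run the quarantined LLM on untrusted content and return only
        an opaque variable name, never the content itself."""
        result = self.quarantined_llm(f"{task}\n\n{untrusted_text}")
        name = f"$VAR{len(self.variables) + 1}"
        self.variables[name] = result
        return name

    def plan(self, user_request: str, variable_names: List[str]) -> str:
        """Ask the privileged LLM what to do. It sees the trusted user
        request and variable names, but not the variables' contents."""
        prompt = (
            f"User request: {user_request}\n"
            f"Available variables (contents hidden): {', '.join(variable_names)}\n"
            "Respond with the action to take, referencing variables by name."
        )
        return self.privileged_llm(prompt)

    def expand(self, action: str) -> str:
        """Substitute variable contents at execution time, in ordinary
        code, so injected instructions never reach the planner."""
        for name, value in self.variables.items():
            action = action.replace(name, value)
        return action
```

For example, the controller could quarantine an incoming email ("summarize this"), get back "$VAR1", let the privileged LLM decide "draft a reply using $VAR1", and only expand $VAR1 when the reply is actually composed.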
SkyPuncher | 2 years ago
There are nearly infinite ways to word an attack. You can only protect against the most common of them.
cubefox | 2 years ago
https://news.ycombinator.com/item?id=35929145