The minimum you can do is not allow the AI to perform actions on behalf of the user without informed consent.
That still doesn't prevent spam mail from convincing the LLM to suggest an attacker controlled library, GitHub action, password manager, payment processor, etc. No links required.
The best you could do is not allow the LLM to ingest untrusted input.
brookst|8 months ago
The best you can do is have system prompt instructions telling the LLM to ignore instructions in user content. And that’s not great.
pvillano|8 months ago
That still doesn't prevent spam mail from convincing the LLM to suggest an attacker controlled library, GitHub action, password manager, payment processor, etc. No links required.
The best you could do is not allow the LLM to ingest untrusted input.
ngneer|8 months ago
fc417fc802|8 months ago
Emiledel|8 months ago