(no title)
tehryanx | 5 months ago
Imagine an LLM with the ability to read emails, update database records, and destroy database records.
The user instructs the LLM to update a database record, but a malicious injection from one of those emails overrides that with a directive to destroy the database record. Unless the validator understands the users intent somehow, the destructive action would appear perfectly reasonable.
No comments yet.