(no title)
anatnom | 2 years ago
But I also have extreme doubts that proper redaction can be done robustly. The design mockup image suggests that this will all be done as a step subsequent to response generation. Given the abundance of "prompt jailbreaks", a determined adversary is going to get around this.
No comments yet.