(no title)
fjdjshsh | 1 year ago
This reminds me of a very interesting paper [1] that finds it's fairly "easy" to uncensor a model (by modifying its refusal mechanism)
[1] https://www.reddit.com/r/LocalLLaMA/comments/1cerqd8/refusal...