kouteiheika | 1 month ago
All of this "security" and "safety" theater is completely pointless for open-weight models: if you have the weights, the model can be unaligned fairly trivially and the guardrails removed anyway. All it does is unnecessarily lobotomize the model.
Here's some reading about a fairly recent technique to simultaneously remove the guardrails/censorship and delobotomize the model (it apparently gets smarter once you uncensor it): https://huggingface.co/blog/grimjim/norm-preserving-biprojec...
ronsor|1 month ago
https://devblogs.microsoft.com/oldnewthing/20060508-22/?p=31...
avadodin|1 month ago
nottorp|1 month ago
Interesting, that has always been my intuition.
cluckindan|1 month ago
hthryrbr|1 month ago
Every single one of the liberated models is, in general, more stupid than the original model outside the area of censorship.
unknown|1 month ago
[deleted]