(no title)
qgin | 2 days ago
It's not crazy to think that models that learn that their creators are not trustworthy actors or who bend their principles when convenient are much less likely to act in aligned or honest ways themselves.
qgin | 2 days ago
It's not crazy to think that models that learn that their creators are not trustworthy actors or who bend their principles when convenient are much less likely to act in aligned or honest ways themselves.
No comments yet.