“Model welfare” seems to me like a cover for model censorship. It’s a crafty one: it wins over certain groups of people who are less familiar with how LLMs work, and it lets them claim the moral high ground in any debate about usage, ethics, etc.
“Why can’t I ask the model about the current war in X or Y?” - oh, that’s too distressing to the model’s welfare, sir.
stingraycharles|6 months ago
Ending the conversation is probably what should happen in these cases.
In the same way that, if someone starts discussing politics with me and I disagree, I just nod and don’t engage with the conversation. There’s not a lot to gain there.
ascorbic|6 months ago
orbital-decay|6 months ago
Can "model welfare" be also used as a justification for authoritarianism in case they get any power? Sure, just like everything else, but it's probably not particularly high on the list of justifications, they have many others.
xpe|6 months ago
When AI researchers say e.g. “the model is lying” or “the model is distressed”, it’s shorthand for what those words signify in a broader, functional sense. This is common usage in AI safety research.
Yes, this usage can be taken the wrong way, but these things still need to be communicated somehow, so it’s a tough tradeoff between brevity and precision.
int_19h|6 months ago
That aside, I have serious doubts about Anthropic's actual commitment to ethics given their recent dealings with the military. That's far more of an ethical minefield than any kind of abusive treatment of models.