“You’re not [X]—you’re [Y]” is the one that drives me nuts. [X] is typically some negative characterization that, without RLHF, the model would likely just state directly. I get enough politics/subtext from humans. I’d rather the LLM just call it straight.