top | item 47207457

jsheard | 10 hours ago

LLMs in their raw form might follow the frequencies of the training data, but nobody uses raw LLMs; they use models which have been RLHFed to hell and back to bias them towards specific patterns. Then newer models were trained on the output of those RLHFed models, and further RLHFed, and so on, and so on.

amelius|10 hours ago

The H in RLHF stands for human. If humans didn't use the expression, then the LLM wouldn't.

jsheard|10 hours ago

In practice RLHF isn't a survey of every living human's personal style or preferences, though. Its purpose is to make the model more useful in the eyes of the vendor, mainly by getting cheap third-world labor to nudge the model according to the vendor's instructions. You don't get a subservient, sycophantic and "safe" chat interface out of unstructured data without putting your thumb on the scale, hard.