drsim | 1 year ago

It is RLHF if I understand correctly.

chmod775|1 year ago

The Venn diagram of people to whom this comment contains no new information and those who know what "RLHF" means is almost a perfect circle.

For anyone not part of that intersection: RLHF means reinforcement learning from human feedback.

grumpopotamus|1 year ago

Well, HF.

kleiba|1 year ago

Right!

I suppose it's roughly as much "training AI models" as labeling training data is "training supervised models".