ren_engineer | 1 year ago

Not sure why people are surprised. It's been known for a long time that RLHF essentially lobotomizes LLMs by training them to give answers the base model wouldn't give. DeepSeek is better because they didn't gimp their own model.
