top | item 45925697

(no title)

giuscri | 3 months ago

but it’s also true that the next sentence is generated by evaluating the whole conversation including the proposed solution.

my mental model is that the llm learned to predict what another person would say just by looking at that solution.

so it’s really telling whether the solution is likely (likely!) to be right or wrong

discuss

ben_w|3 months ago

Slight quibble, but the reinforcement learning from human feedback means they're trained (somewhat) on what the specific human asking the question is likely to consider right or wrong.

This is both why they're sycophantic, and also why they're better than just median internet comments.

But this is only a slight quibble, because what you say is also somewhat true, and why they have such a hard time saying "I don't know".

giuscri|3 months ago

idk… maybe we’ll found out the reason is that on the internet no one ends a conversation saying “i don’t know” :D