anavat | 1 year ago

My guess is that this is an artifact of the RLHF stage of training. Answers like "I don't know" or "let me think and let's catch up on this next week" get flagged down by human raters, which eventually trains the LLM to avoid that path altogether. And it probably makes sense, because otherwise "I don't know" would come up far too often, even in cases where the LLM is perfectly able to give the answer.
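
A rough sketch of the mechanism being described, assuming the standard Bradley-Terry objective used to train RLHF reward models; the reward numbers are invented purely for illustration:

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Hypothetical scalar rewards a reward model might assign to two
    # candidate answers to the same prompt, where raters preferred
    # the confident answer over the hedging one.
    reward_confident = 1.2   # rater-preferred answer
    reward_idk = -0.4        # "I don't know" style answer, flagged down

    # Bradley-Terry preference loss for reward-model training:
    # minimizing it pushes the preferred reward above the rejected one.
    loss = -math.log(sigmoid(reward_confident - reward_idk))
    print(f"preference loss: {loss:.3f}")

The later RL stage then tunes the policy to maximize this learned reward, so answer styles that raters consistently downrank get sampled less and less often.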

gf000 | 1 year ago

I don't know, that seems like a fundamental limitation. LLMs don't have any ability to reflect on their own knowledge or abilities.

ben_w | 1 year ago

Humans aren't very aware of their limits, either.

Even the Dunning-Kruger effect is, ironically, widely misunderstood by people who are unreasonably confident about their knowledge.