top | item 41777942

(no title)

> during pre-training, there is never an incentive for the model to say "I don't know" because it would be penalized. the model is incentivized to make an educated guess

The guess can be "I don't know". The base LLM would generally only say I don't know if it "knew" that it didn't know, which is not going to be very common. The tuned LLM would be the level responsible for trying to equate a lack of understanding to saying "I don't know"

discuss

No comments yet.