I have noticed that asking an LLM to output a confidence score, along with the reason for assigning it, works really well. Both are tangential to the actual task, but they still improve the quality of the output.

I wouldn't depend on the numerical value of the confidence score itself, though. There is no way for the LLM to calibrate its confidence score across multiple invocations on different data. I have found the number itself to be mostly useless as a metric.

It does work fine as a proxy to induce some thinking.
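
A minimal sketch of this pattern in Python, assuming the model is asked to reply in JSON. The prompt wording, the key names, and both helper functions are made up for illustration, and no real LLM API is called; the reply string is hand-written, not actual model output.

```python
import json

# Hypothetical prompt suffix: asks for a confidence score and its reason
# alongside the actual answer.
PROMPT_TEMPLATE = (
    "{task}\n\n"
    "Respond with a JSON object with the keys:\n"
    '  "answer": your answer to the task\n'
    '  "confidence": a number from 0 to 1\n'
    '  "confidence_reason": one or two sentences explaining the score\n'
)


def build_prompt(task: str) -> str:
    """Append the confidence/reason request to the actual task."""
    return PROMPT_TEMPLATE.format(task=task)


def extract_answer(raw_response: str) -> str:
    """Parse the model's JSON reply and keep only the answer.

    The confidence fields are deliberately dropped: they are requested to
    induce some reasoning, not to be compared across invocations.
    """
    data = json.loads(raw_response)
    return data["answer"]


# Hand-written stand-in for a model reply (illustration only).
example_reply = (
    '{"answer": "spam", "confidence": 0.9, '
    '"confidence_reason": "Contains a suspicious link."}'
)
print(extract_answer(example_reply))  # prints "spam"
```

The score and its reason are still requested in the prompt, but downstream code only reads the answer, since the number is not comparable from one call to the next.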
