hassleblad23 | 1 year ago
I wouldn't depend on the numerical value of the confidence score itself though. The LLM has no way to calibrate its confidence score across multiple invocations on different data, so the same number from two separate calls need not mean the same thing. I have found this metric to be mostly useless.
It works fine as a proxy to induce some thinking though.
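If you want to check the calibration point on your own data rather than take my word for it, here is a rough sketch. It assumes you have already collected self-reported confidence scores in [0, 1] along with a ground-truth flag for whether each answer was actually correct. The function names (`calibration_table`, `expected_calibration_error`) and the equal-width binning are my own choices for illustration, not part of any LLM API.

```python
from typing import List, Tuple

# Hypothetical helpers for illustration; not part of any LLM library.

def calibration_table(
    scores: List[float], correct: List[bool], n_bins: int = 5
) -> List[Tuple[float, float, float, float, int]]:
    """Bucket self-reported confidences into equal-width bins.

    Returns one row per non-empty bin:
    (bin_low, bin_high, mean_confidence, observed_accuracy, count).
    """
    if len(scores) != len(correct):
        raise ValueError("scores and correct must have the same length")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for s, c in zip(scores, correct):
        # Clamp so that a score of exactly 1.0 falls into the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        bins[idx].append((s, c))
    rows = []
    for i, b in enumerate(bins):
        if not b:
            continue
        mean_conf = sum(s for s, _ in b) / len(b)
        accuracy = sum(1 for _, c in b if c) / len(b)
        rows.append((i / n_bins, (i + 1) / n_bins, mean_conf, accuracy, len(b)))
    return rows


def expected_calibration_error(
    scores: List[float], correct: List[bool], n_bins: int = 5
) -> float:
    """Count-weighted average gap between stated confidence and accuracy."""
    total = len(scores)
    if total == 0:
        raise ValueError("need at least one sample")
    return sum(
        (n / total) * abs(conf - acc)
        for _, _, conf, acc, n in calibration_table(scores, correct, n_bins)
    )


if __name__ == "__main__":
    # Made-up numbers purely to exercise the code, not real measurements.
    scores = [0.9, 0.9, 0.9, 0.9]
    correct = [True, False, False, False]
    print(calibration_table(scores, correct))
    print(expected_calibration_error(scores, correct))
```

A result near 0 would mean the stated confidence tracks accuracy on that sample; a large gap would mean the number is not something to threshold on. Comparing the result across different datasets is the part that speaks to the cross-invocation issue.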