geokon | 14 hours ago
Extremely naive question... but could LLM output be tagged with some kind of confidence score? Like if I'm asking an LLM a question, does it have an internal metric for how confident it is in its output? LLM outputs rarely seem to take the form "I'm not really sure, but maybe it's XXX", yet I always felt this is baked into the model somehow.
andy12_|14 hours ago
Edit: There is also some other work suggesting that chat models might not be calibrated at the token level, but might be calibrated at the concept level [2]. That means that if you sample many answers and group them by semantic similarity, the resulting distribution is also calibrated. The problem is that generating many answers and grouping them is more costly.
[1] https://arxiv.org/pdf/2303.08774 Figure 8
[2] https://arxiv.org/pdf/2511.04869 Figure 1.
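The sample-and-group idea can be sketched roughly like this (a crude version: exact string matching after normalization stands in for real semantic grouping, which would use an embedding model; the samples are hypothetical):

```python
from collections import Counter

def agreement_confidence(answers):
    """Group sampled answers by a crude normalization (real systems would use
    embedding similarity) and return the modal answer plus the share of
    samples that agree with it, as a rough 0-1 confidence score."""
    normalized = [a.strip().lower().rstrip(".") for a in answers]
    counts = Counter(normalized)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(normalized)

# Hypothetical samples of the same question asked five times
samples = ["1969", "1969", "1969.", "1971", "1969"]
answer, conf = agreement_confidence(samples)  # -> ("1969", 0.8)
```

The cost objection in the comment shows up directly here: you pay for five generations to get one calibrated-ish score.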
geokon|14 hours ago
You could color-code the output tokens so you can see abrupt changes.
It seems kind of obvious, so I'm guessing people have tried this.
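People have indeed built variants of this on top of APIs that expose per-token logprobs. A minimal sketch (the thresholds and the (token, logprob) pairs are made up for illustration):

```python
import math

def colorize(tokens_with_logprobs):
    """Render tokens with ANSI background colors: green for high-probability
    tokens, yellow for mid, red for low, so abrupt confidence drops stand out."""
    out = []
    for token, logprob in tokens_with_logprobs:
        p = math.exp(logprob)  # logprob -> probability
        if p > 0.8:
            code = "42"   # green background
        elif p > 0.3:
            code = "43"   # yellow background
        else:
            code = "41"   # red background
        out.append(f"\x1b[{code}m{token}\x1b[0m")
    return "".join(out)

# Hypothetical (token, logprob) pairs, e.g. from an API that returns logprobs
demo = [("The", -0.05), (" capital", -0.1), (" is", -0.2), (" Quito", -1.6)]
print(colorize(demo))
```

In this demo the factual token " Quito" (p ≈ 0.20) renders red while the boilerplate tokens render green, which is exactly the "abrupt change" the comment describes.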
chongli|2 hours ago
Think of traditional statistics. Suppose I said "80% of those sampled preferred apples to oranges, and my 95% confidence interval is within +/- 2% of that" but then I didn't tell you anything about how I collected the sample. Maybe I was talking to people at an apple pie festival? Who knows! Without more information on the sampling method, it's hard to make any kind of useful claim about a population.
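For concreteness, the quoted interval pins down a sample size you can back out (using the standard Wald approximation; the numbers are just the ones from the example above):

```python
import math

# Wald 95% confidence interval for a proportion: margin = z * sqrt(p*(1-p)/n).
# Solving for n with p = 0.80, z = 1.96, and a +/- 2% margin:
p, z, margin = 0.80, 1.96, 0.02
n = math.ceil(z**2 * p * (1 - p) / margin**2)
# n works out to 1537 respondents -- but as the comment notes, no sample
# size rescues a biased sampling method (e.g. polling at an apple pie festival).
```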
This is why I remain so pessimistic about LLMs as a source of knowledge. Imagine you had a person who was raised from birth in a completely isolated lab environment and taught only how to read books, including the dictionary. They would know how all the words in those books relate to each other but know nothing of how that relates to the world. They could read the line "the killer drew his gun and aimed it at the victim" but what would they really know of it if they'd never seen a gun?
radarsat1|2 hours ago
I mean, I sort of understand what you're trying to say, but in fact a great deal of the knowledge we have about the world we get second-hand.
There are plenty of people who've never held a gun, or had a gun aimed at them, and... granted, you could argue they probably wouldn't read that line the same way as people who have, but that doesn't mean the average Joe who's never been around a gun can't enjoy media that features guns.
Same thing about lots of things. For instance it's not hard for me to think of animals I've never seen with my own eyes. A koala for instance. But I've seen pictures. I assume they exist. I can tell you something about their diet. Does that mean I'm no better than an LLM when it comes to koala knowledge? Probably!
DavidSJ|14 hours ago
[Edit: but to be clear, for a pretrained model this probability means "what's my estimate of the conditional probability of this token occurring in the pretraining dataset?", not "how likely is this statement to be true?" And for a post-trained model, the probability really has no simple interpretation other than "this is the probability that I will output this token in this situation".]
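The probability being described is just the softmax over the model's logits for the next token. A self-contained sketch (tiny hypothetical vocabulary, made-up logits):

```python
import math

def softmax(logits):
    """Convert raw model logits into a probability distribution over tokens."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits over a tiny 4-token vocabulary
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax(logits)
# probs[0] is the model's probability for token 0 -- for a pretrained model,
# an estimate of how often that token follows this context in the training
# distribution, not a score for whether the resulting statement is true.
```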
mr_toad|11 hours ago
Basically, you’d need a lot more computing power to come up with a distribution of the output of an LLM than to come up with a single answer.
jorvi|5 hours ago
You never see this in the response but you do in the reasoning.
podnami|14 hours ago
- How strongly it has been aligned to "know" that something is true (e.g. ethical constraints)
- Statistical weight: whether it can corroborate one alternative more strongly than another in its training data
- If it's a web-search-related query, whether the statement comes from original sources or is synthesised from, say, third-party sources
But I’m just a layman and could be totally off here.
Lionga|14 hours ago
E.g. the (wrong) answer that there are two r's in "strawberry" could very well have a very high "confidence score", while a random but rare correct fact might have a very low one.
In short: LLMs have no concept of truth, nor any drive to produce it.
amelius|12 hours ago
They do produce true statements most of the time, though.