Show HN: Klarity – Open-source tool to analyze uncertainty/entropy in LLM output
132 points | mrciffa | 1 year ago | github.com
What Klarity does:
- Real-time analysis of model uncertainty during generation
- Dual analysis combining log probabilities and semantic understanding
- Structured JSON output with actionable insights
- Fully self-hostable with customizable analysis models
The tool analyzes each step of text generation and returns structured JSON:
- uncertainty_points: array of {step, entropy, options[], type}
- high_confidence: array of {step, probability, token, context}
- risk_areas: array of {type, steps[], motivation}
- suggestions: array of {issue, improvement}
Currently supports Hugging Face Transformers (more frameworks coming). We tested extensively with Qwen2.5 (0.5B–7B) models, but it should work with most HF LLMs.
Installation is simple: `pip install git+https://github.com/klara-research/klarity.git`
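To make the output concrete, here is a minimal sketch of consuming the structured JSON described above. It uses only the field names from the schema; the concrete values are invented for illustration, and no Klarity API calls are shown:

```python
import json

# Example payload following the schema above; values are illustrative only.
report = json.loads("""
{
  "uncertainty_points": [
    {"step": 3, "entropy": 2.1, "options": ["cat", "dog", "bird"], "type": "lexical"},
    {"step": 7, "entropy": 0.4, "options": ["the"], "type": "lexical"}
  ],
  "high_confidence": [
    {"step": 1, "probability": 0.98, "token": "The", "context": "start"}
  ],
  "risk_areas": [
    {"type": "hallucination", "steps": [3], "motivation": "many near-tied options"}
  ],
  "suggestions": [
    {"issue": "ambiguous subject", "improvement": "constrain the prompt"}
  ]
}
""")

# Flag generation steps whose entropy exceeds a chosen threshold.
risky = [p["step"] for p in report["uncertainty_points"] if p["entropy"] > 1.0]
print(risky)  # -> [3]
```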
We are building open-source interpretability/explainability tools to visualize & analyze attention maps, saliency maps, etc., and we want to understand your pain points with LLM behaviors. What insights would actually help you debug these black-box systems?
Links:
- Repo: https://github.com/klara-research/klarity
- Our website: https://klaralabs.com
deoxykev|1 year ago
This creates a gap between the mechanical measurement of certainty and true understanding, much like mistaking the map for the territory or confusing the finger pointing at the moon with the moon itself.
I've done some work before in this space, trying to come up with different useful measures from the logprobs, such as Shannon entropy over a sliding window, or even bzip compression ratio as a proxy for information density. But I didn't find anything semantically useful or reliable to exploit.
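The sliding-window entropy measure mentioned above can be sketched in a few lines. The per-step token distributions here are toy values, not real model outputs:

```python
import math

def shannon_entropy(probs):
    """Entropy in bits of a single next-token distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def windowed_entropy(step_distributions, window=3):
    """Mean per-step entropy over a sliding window of generation steps."""
    ents = [shannon_entropy(d) for d in step_distributions]
    return [sum(ents[i:i + window]) / window
            for i in range(len(ents) - window + 1)]

# Toy per-step distributions over a tiny candidate set.
dists = [
    [1.0],                      # fully certain step -> 0 bits
    [0.5, 0.5],                 # 1 bit
    [0.25, 0.25, 0.25, 0.25],   # 2 bits
    [0.5, 0.5],                 # 1 bit
]
print(windowed_entropy(dists, window=2))  # -> [0.5, 1.5, 1.5]
```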
The best approach I found was just multiple-choice questions: "Does X entail Y? Please output [A] True or [B] False." Then measure the logprobs of the next token, which should be `[A` (90%) or `[B` (10%). Then we might make a statement like: the LLM thinks there is a 90% probability that X entails Y.
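Turning the raw logprobs of the two candidate tokens into such a probability statement is just a renormalization over the two options. A sketch, with invented logprob values:

```python
import math

def two_option_probability(logprob_a, logprob_b):
    """Renormalize the logprobs of the two answer tokens.

    Probability mass on all other tokens is discarded, which is the
    implicit assumption behind reading the result as P(True).
    """
    pa, pb = math.exp(logprob_a), math.exp(logprob_b)
    return pa / (pa + pb)

# e.g. logprob of "[A" is -0.105 and of "[B" is -2.30 (illustrative numbers)
p_true = two_option_probability(-0.105, -2.30)
print(round(p_true, 2))  # -> 0.9
```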
activatedgeek|1 year ago
In our paper [1], we find that asking a follow-up question like "Is the answer correct?" and taking the normalized probability of the "Yes" or "No" token (or more generally any such token trained for) seems to be the best bet so far for getting well-calibrated probabilities out of the model.
In general, the log-probability of tokens is not a good indicator of anything other than satisfying the pre-training loss function of predicting the next token (though it is likely very well calibrated on that task). The semantics of language are a much less tamable object, especially because we don't have a good way to estimate a normalizing constant: every answer can be paraphrased in many ways and still be correct. The volume of correct answers in the generation space of a language model is just too small.
There is work that shows one way to approximate the normalizing constant via SMC [2], but I believe we are more likely to benefit from having a verifier at train-time than any other approach.
And there are stop-gap solutions to make log probabilities more reliable by only computing them on "relevant" tokens, e.g. only final numerical answer tokens for a math problem [3]. But this approach kind of side-steps the problem of actually trying to find relevant tokens. Perhaps something more in the spirit of System 2 attention which selects meaningful tokens for the generated output would be more promising [4].
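The stop-gap of scoring only "relevant" tokens amounts to masking the per-token logprobs to the answer span before aggregating. A minimal sketch with invented values (the span indices and logprobs are illustrative, and finding the span is the part this approach side-steps):

```python
import math

def answer_confidence(token_logprobs, answer_span):
    """Aggregate logprobs over only the final-answer tokens.

    token_logprobs: per-token logprobs for the full generation.
    answer_span: (start, end) indices of the answer tokens.
    Returns the geometric-mean per-token probability of the answer.
    """
    start, end = answer_span
    relevant = token_logprobs[start:end]
    mean_lp = sum(relevant) / len(relevant)
    return math.exp(mean_lp)

# 8 generated tokens; suppose the final answer occupies tokens 6-7.
lps = [-0.5, -1.2, -0.8, -2.0, -0.3, -0.9, -0.05, -0.10]
print(answer_confidence(lps, (6, 8)))  # high confidence on the answer span
```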
[1]: https://arxiv.org/abs/2406.08391
[2]: https://arxiv.org/abs/2404.17546
[3]: https://arxiv.org/abs/2402.10200
[4]: https://arxiv.org/abs/2311.11829
Folcon|1 year ago
> MIT License. See LICENSE for more information.

But the LICENSE file is Apache-2.0. Which is it?
thomastjeffery|1 year ago
I think most people clicking that button would be better served by scrolling down, but that's not made very obvious.