(no title)

ychen306 | 1 year ago

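How this works: the LLM predicts a probability distribution over the next token, and an arithmetic coder encodes the actual token into bits under that distribution. The decoder runs the same model on the same prefix, gets the same distribution, and recovers the exact token, so the scheme is lossless and can never hallucinate. In the worst case, when the LLM assigns the true token a very low probability p, you just spend more bits (roughly -log2 p for that token), but it doesn't affect correctness.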
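
To make that concrete, here is a minimal sketch, not the actual implementation: a toy arithmetic coder over a two-letter alphabet using exact fractions. `good_model` and `bad_model` are hypothetical stand-ins for the LLM, and passing the message length `n` to the decoder is an assumption made to keep the example short (a real coder would use an end-of-stream token and fixed-precision arithmetic).

```python
from fractions import Fraction
import math

ALPHABET = "ab"  # toy vocabulary; a real LLM has tens of thousands of tokens

def good_model(context):
    # Hypothetical predictor that strongly expects 'a' (ignores context).
    return {"a": Fraction(9, 10), "b": Fraction(1, 10)}

def bad_model(context):
    # Deliberately poor predictor: strongly expects 'b'.
    return {"a": Fraction(1, 10), "b": Fraction(9, 10)}

def encode(text, model):
    """Narrow [low, low + width) once per symbol, then emit a bit string
    naming a point inside the final interval."""
    low, width = Fraction(0), Fraction(1)
    for i, ch in enumerate(text):
        probs = model(text[:i])
        for sym in ALPHABET:
            if sym == ch:
                width *= probs[sym]
                break
            low += width * probs[sym]
    # Smallest k with 2**-k <= width / 2, so a k-bit fraction fits inside.
    k = (math.ceil(2 / width) - 1).bit_length()
    code = math.ceil(low * 2**k)
    return format(code, f"0{k}b")

def decode(bits, n, model):
    """Replay the same model and pick, at each step, the symbol whose
    sub-interval contains the encoded point."""
    value = Fraction(int(bits, 2), 2 ** len(bits))
    low, width = Fraction(0), Fraction(1)
    out = ""
    for _ in range(n):
        probs = model(out)
        for sym in ALPHABET:
            span = width * probs[sym]
            if value < low + span:
                out += sym
                width = span
                break
            low += span
    return out

message = "aaabaaaa"
for model in (good_model, bad_model):
    bits = encode(message, model)
    # Round trip is exact with either model; only the bit count differs.
    print(model.__name__, len(bits), decode(bits, len(message), model) == message)
```

Both models decode the message exactly; the bad one just produces a longer bit string, because the final interval it narrows down to is much smaller.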