top | item 40271749

(no title)

blackle | 1 year ago

My understanding is that minimizing perplexity (what LLMs are generally optimized for) is equivalent to finding a good probably distribution over the next token.

discuss

No comments yet.