steenreem | 1 year ago
I would think that the purpose of concepts is to capture information at a higher density than tokens, so you can remember a longer conversation or better produce long-form output.
Given that, I would have expected the concept model to be evaluated during training on how few concepts it emits before it emits a stop.
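To make that concrete, here is a rough sketch of the kind of objective I have in mind. Everything in it is hypothetical: the names `count_until_stop` and `length_penalized_score`, the `weight` parameter, and the `<stop>` marker are made up for illustration and are not taken from any actual concept-model implementation.

```python
def count_until_stop(concepts, stop="<stop>"):
    """Return how many concepts were emitted before the first stop marker."""
    for i, concept in enumerate(concepts):
        if concept == stop:
            return i
    # No stop was emitted, so every concept counts toward the length.
    return len(concepts)


def length_penalized_score(task_loss, concepts, weight=0.01, stop="<stop>"):
    """Lower is better: the task loss plus a fixed cost per emitted concept.

    `task_loss` stands in for whatever measures output quality;
    `weight` (an assumed hyperparameter) trades quality against brevity.
    """
    return task_loss + weight * count_until_stop(concepts, stop)
```

With a score like this, two outputs of equal quality would be ranked by how few concepts they needed, which is the pressure toward denser concepts I would have expected.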