(no title)
wantsanagent | 1 year ago
This paper proposes to "compress sequence information into an anchor token," which is then used at inference time to reduce the information needed for prediction and to speed that prediction up. The authors do this by "continually pre-training the model to compress sequence information into the anchor token."
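To make the idea concrete, here is a rough sketch of what an anchor-style attention mask could look like. This is my own reading of the summary, not the paper's code: I'm assuming the text is split into segments, that the last token of each segment is its anchor, and that a token may attend causally to its own segment plus the anchors of earlier segments. The function name `anchor_attention_mask` is made up for illustration.

```python
def anchor_attention_mask(segment_lengths):
    """Build a boolean mask where mask[i][j] is True if query i may attend to key j.

    Assumptions (illustrative, not taken from the paper's code):
    - the sequence is a concatenation of segments with the given lengths
    - the last token of each segment is that segment's anchor token
    """
    if any(length <= 0 for length in segment_lengths):
        raise ValueError("every segment must contain at least one token")

    segment_of = []   # segment index for each position
    anchors = set()   # positions of anchor tokens
    pos = 0
    for seg_id, length in enumerate(segment_lengths):
        segment_of.extend([seg_id] * length)
        pos += length
        anchors.add(pos - 1)

    n = len(segment_of)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):  # causal: only current and earlier positions
            same_segment = segment_of[i] == segment_of[j]
            if same_segment or j in anchors:
                mask[i][j] = True
    return mask


mask = anchor_attention_mask([3, 2])
# Number of keys the final token attends to, versus full causal attention.
keys_with_anchors = sum(mask[-1])
keys_full_causal = len(mask)
```

Under these assumptions, a token in a later segment only needs the keys/values of earlier anchors rather than of every earlier token, which is where the inference-time savings would come from.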