top | item 40058866

euclaise | 1 year ago

This one does have attention; it's just chunked into segments of 4096 tokens.

cs702|1 year ago

Yes, but the claim is about "unlimited context length." I doubt attention over each segment can be as good at recall as attention over the full input context.
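
To make the concern concrete, here is a rough sketch of what purely segment-local causal attention permits, compared with full causal attention. It is an illustration under assumptions, not the model's actual implementation: the helper names `segment_causal_mask` and `full_causal_mask` are made up for this example, the sizes are toy values (segments of 4 standing in for 4096), and it ignores any mechanism a model might use to carry information across segments.

```python
def segment_causal_mask(seq_len, segment_len):
    """mask[q][k] is True when query position q may attend to key position k.

    Hypothetical helper: attention is causal and confined to the segment
    that contains q.
    """
    return [
        [k <= q and k // segment_len == q // segment_len for k in range(seq_len)]
        for q in range(seq_len)
    ]


def full_causal_mask(seq_len):
    """Ordinary causal attention: q may attend to every earlier position."""
    return [[k <= q for k in range(seq_len)] for q in range(seq_len)]


# Toy sizes: 8 tokens, segments of 4 (standing in for 4096).
chunked = segment_causal_mask(8, 4)
full = full_causal_mask(8)

# Token 5 sits in the second segment; token 1 sits in the first.
print(full[5][1])     # True: full attention can look back at token 1
print(chunked[5][1])  # False: token 1 is in an earlier segment
print(chunked[5][4])  # True: token 4 is in the same segment
```

With only the segment-local mask, anything in an earlier segment can be recalled only if some other path carries it forward, which is why I'd expect recall to suffer.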