oneseven | 2 years ago
It seems like learned positional encodings would still prevent you from fine-tuning on a larger context size, though, so maybe ALiBi is still relevant (although I have not read that paper).

jimsimmons | 2 years ago
You can collapse all positions beyond a length to a specific bucket, like T5 does.
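To make the bucketing idea concrete, here is a minimal Python sketch modeled on T5's relative-position bucketing for the causal case (this is not T5's actual code; the function name and the default num_buckets/max_distance values are illustrative). Nearby offsets get one bucket each, distant offsets share logarithmically spaced buckets, and everything at or beyond max_distance collapses into the last bucket, so a context longer than the training length never produces an out-of-range position:

    import math

    def relative_position_bucket(distance, num_buckets=32, max_distance=128):
        """Map a non-negative key-to-query distance to a bucket index.

        T5-style scheme (unidirectional case): the first half of the
        buckets cover small distances exactly, the second half cover
        larger distances on a log scale, and every distance at or
        beyond max_distance falls into the final bucket.
        """
        max_exact = num_buckets // 2
        if distance < max_exact:
            return distance  # one bucket per position for nearby tokens
        # Log-spaced buckets for distant tokens, capped at the last bucket.
        bucket = max_exact + int(
            math.log(distance / max_exact)
            / math.log(max_distance / max_exact)
            * (num_buckets - max_exact)
        )
        return min(bucket, num_buckets - 1)

    print(relative_position_bucket(5))       # 5: exact bucket for a nearby token
    print(relative_position_bucket(50))      # 24: log-spaced bucket
    print(relative_position_bucket(10_000))  # 31: shared "far away" bucket

Because the learned bias is attached to the bucket rather than the absolute position, any distance beyond max_distance reuses the last bucket's embedding, which is what lets this scheme extrapolate past the trained context length.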