WingNews logo WingNews
item 35911442


rewq4321 | 2 years ago

> Also the attention mechanism is baked in during pretraining

IIUC, this is no longer necessarily true with positional encodings like ALiBi (Attention with Linear Biases), which let a model extrapolate to context lengths longer than those seen during pretraining: https://github.com/ofirpress/attention_with_linear_biases
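For readers unfamiliar with ALiBi: instead of adding position embeddings to the tokens, it adds a head-specific linear penalty on query-key distance directly to the attention logits. A minimal NumPy sketch (using the paper's slope formula for power-of-two head counts; function names are my own):

```python
import numpy as np

def alibi_slopes(n_heads):
    """Per-head slopes from the ALiBi paper: 2^(-8/n), 2^(-16/n), ...
    (geometric sequence; the paper has a special case for non-power-of-two
    head counts, omitted here)."""
    return np.array([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])

def alibi_bias(seq_len, n_heads):
    """Bias added to attention logits: -m_h * (i - j) for key j <= query i."""
    pos = np.arange(seq_len)
    # distance (i - j); future keys (j > i) get 0 here, the causal mask
    # handles them anyway
    dist = np.maximum(pos[:, None] - pos[None, :], 0)
    return -alibi_slopes(n_heads)[:, None, None] * dist  # shape (heads, q, k)
```

The bias is simply added to the scaled dot-product logits before softmax, so it works at any sequence length, with no position embeddings baked into the weights at pretraining time.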


powered by hn/api // news.ycombinator.com