mattjjatgoogle | 1 year ago

An author's tweet thread: https://x.com/jacobaustin132/status/1886844716446007300

awongh|1 year ago

In the thread, he says (https://x.com/jacobaustin132/status/1886844724339675340): `5 years ago, there were many ML architectures, but today, there is (mostly) only one [transformers].`

To what degree is this actually true, and what else is on the horizon that might become as popular as transformers?

swyx|1 year ago

it's quite true. the convergence of all archs to transformers is well documented by karpathy. SSMs were once touted as transformer killers, but increasingly look like just optional supplements.