top | item 40611185

opprobium | 1 year ago

It's not just that they can't solve it efficiently; they can't solve it at all.

logicchains|1 year ago

They can solve it if you keep adding layers to the transformer. It's just not efficient: you'd need exponentially more layers than a similarly sized RNN.