item 40611185
opprobium | 1 year ago
Not just efficiently, can't solve.
logicchains | 1 year ago
They can solve it if you keep adding layers to the transformer; it's just not efficient. You'd need exponentially more layers than a similarly sized RNN.