top | item 47201181

(no title)

pankajdoharey | 1 day ago

Looks like a Tiny Analytic transformer, RNN is arguably a better choice if you are gonna handwire an architecture to mechanically do addition. Learning is about discovering the patterns and algorithm from data. Wiring a machine to follow a procedure defeats that purpose.

discuss

order

dnautics|1 day ago

it proves that the algorithm is embeddable in a bigger transformer of ~similar architecture.