top | item 41164026

(no title)

zygy | 1 year ago

Naive question: what's the intuition for how this is different from increasing the number of learnable parameters on a regular MLP?

discuss

slashdave|1 year ago

Orthogonality ensures that each weight has its own, individual importance. In a regular MLP, the weights are naturally correlated.