top | item 36066037

(no title)

joppy | 2 years ago

The compare-how-big-a-lookup-table-is argument is a bit of a red herring for comparing how complex things are. For example, a 3x3 matrix implements a map from 3 floats to another three floats, a huge space of possibilities (if we have 4-byte floats, this function space has (2^96)^(2^96) elements). From this perspective, representing that map as 9 numbers is an amazing compression ratio. But surely one cannot argue that matrices “have more going on” than arbitrary functions.

discuss

order

jbay808|2 years ago

I would interpret this as showing that matrix multiplication code is carefully engineered to correctly implement... well, matrix multiplication. Stumbling on that specific mapping of 96 input bits to 96 output bits would be hard to pick out of a hat by chance, from the set of all possible mappings. Learning that precise mapping, starting from a uniform prior and only given a finite set of examples, could be seen as an impressive task, although less impressive than sorting. If a model learns the correct mapping -- and better yet, needs only 9 parameters to implement it -- then I think it's fairer to say the model does matrix multiplication, rather than that the model convincingly imitates the statistics of matrix multiplication.

ikiris|2 years ago

This is kind of like saying a transistor can't make a decision, all it does is pass electrons or not based on inputs.

The decisions happen because of how they're wired.