top | item 23971931

vivekkalyan | 5 years ago

> We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy to correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network, then apply them whenever the prompt suggested to do so.

I strongly disagree. GPT-3 has 100% accuracy on 2-digit addition, 80% on 3-digit addition, 25% on 4-digit addition, and 9% on 5-digit addition. If it really did "understand arithmetic", increasing the number of digits should not affect its accuracy.

My perspective as an ML practitioner is that the cool part of GPT-3 is that it stores information effectively and can decode queries more easily than before to retrieve the information required. With something like arithmetic, though, the most efficient approach would be to understand the rules of addition, but the internal structure is too rigid to encode those rules atm.
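
To make the "rules of addition" point concrete, here is a minimal sketch of schoolbook carry addition in Python. It says nothing about what GPT-3 does internally; the names `add_by_rule` and `accuracy` are made up for illustration. The point is only that a procedure which actually encodes the carry rule behaves the same on 2 digits as on 5.

```python
import random


def add_by_rule(a: str, b: str) -> str:
    """Add two non-negative decimal strings digit by digit, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter operand with zeros
    carry = 0
    digits = []
    for da, db in zip(reversed(a), reversed(b)):
        # One rule, applied identically at every position.
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


def accuracy(n_digits: int, trials: int = 1000, seed: int = 0) -> float:
    """Fraction of random n-digit sums that match Python's built-in addition."""
    rng = random.Random(seed)
    lo, hi = 10 ** (n_digits - 1), 10 ** n_digits - 1
    correct = 0
    for _ in range(trials):
        x, y = rng.randint(lo, hi), rng.randint(lo, hi)
        correct += add_by_rule(str(x), str(y)) == str(x + y)
    return correct / trials


for n in range(2, 6):
    print(f"{n}-digit addition accuracy: {accuracy(n)}")
```

Because the same carry step is applied at each position, the accuracy of this procedure does not depend on how many digits the operands have, which is the behavior I would expect from anything that had really learned the rule.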
