top | item 33942139

miltondts | 3 years ago

What I don't understand is where the memory is. How does GPT-3 or ChatGPT remember so much information with just that architecture? It would seem that the most it could remember is 2048 words.

EDIT: Maybe it's 2048 x 96? Still seems low for what it can do.

mjburgess | 3 years ago

300bn weights at 4 bytes/weight is 1.2 TB.
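
For anyone who wants to check that arithmetic, here is a rough sketch. The 300bn count and the 4 bytes per weight (i.e. float32) are taken from the comment above; the function name and the use of decimal terabytes (10^12 bytes) are my own assumptions for illustration.

```python
def weights_size_tb(num_weights: int, bytes_per_weight: int = 4) -> float:
    """Raw storage needed for the weights, in decimal terabytes (1 TB = 10**12 bytes)."""
    return num_weights * bytes_per_weight / 10**12


if __name__ == "__main__":
    # 300bn weights stored as 4-byte floats, as in the comment above
    print(weights_size_tb(300 * 10**9))
    # Same count at 2 bytes per weight (half precision), for comparison
    print(weights_size_tb(300 * 10**9, bytes_per_weight=2))
```

This only counts the raw weights; it says nothing about how much "information" those weights encode.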

Epa095 | 3 years ago

Yes, but how does it remember the stuff you told it earlier in the conversation? That 1.2TB is the trained model, and I assume those weights are not changed by the conversation?
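
One plausible way this works, sketched below under stated assumptions: the weights stay fixed, and the earlier turns of the conversation are simply fed back in as part of the input each time, cut down to whatever fits in the context window. How ChatGPT actually does this is not stated in this thread, so treat it as an illustration only. `build_prompt` is a hypothetical helper, the whitespace split is a stand-in for a real tokenizer, and 2048 is the limit mentioned upthread.

```python
CONTEXT_LIMIT = 2048  # window size mentioned upthread; assumed here for illustration


def build_prompt(history, new_message, limit=CONTEXT_LIMIT):
    """Join earlier turns with the new message and keep only the most recent tokens.

    Hypothetical helper: a real system would use the model's own tokenizer,
    not a whitespace split.
    """
    if limit <= 0:
        return []
    tokens = []
    for turn in history + [new_message]:
        tokens.extend(turn.split())
    # Anything older than the last `limit` tokens is dropped, i.e. "forgotten"
    return tokens[-limit:]


if __name__ == "__main__":
    history = ["my name is Ada", "nice to meet you Ada"]
    print(build_prompt(history, "what is my name", limit=8))
```

In this picture nothing is written into the weights during a chat; the "memory" of the conversation is just the text that gets re-sent, which is why it is bounded by the context window.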