
dnautics | 3 days ago

> pattern matches the "idea shape" of words in the "idea space"

It does much more than this. The first layer runs an attention mechanism over all previous tokens and emits an activation representing some summary of the relations between them. The next layer then emits an activation representing relations of relations, and so on up the stack. The LLM can thereby deduce a hierarchy of structural information embedded in the text.
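A toy numpy sketch of that stacking, under heavy simplifying assumptions (no learned query/key/value projections, no multiple heads, no layer norm — just raw dot-product attention with a causal mask; all shapes and values are illustrative):

```python
import numpy as np

def self_attention(x):
    # Single-head causal self-attention: each position attends only to
    # itself and earlier tokens, so its output is a weighted sum over
    # tokens 0..t.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)          # token-to-token affinities
    mask = np.tril(np.ones_like(scores))   # causal: no peeking ahead
    scores = np.where(mask == 1, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                     # mixture of prior token vectors

# Layer 1 mixes raw token embeddings; layer 2 mixes layer-1 mixtures
# ("relations of relations"); layer 3 mixes those, and so on.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))   # 5 tokens, 8-dim embeddings (toy values)
h = x
for layer in range(3):
    h = h + self_attention(h)  # residual connection, as in transformers
print(h.shape)
```

Each pass through the loop re-mixes representations that already summarize earlier relations, which is the hierarchy-building effect the comment describes.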

It's not clear to me how this isn't "understanding".
