top | item 36315347

michaelhartm | 2 years ago

Nobody knows how LLMs work under the hood. It's just lots of stacked transformer layers that encode various concepts. Nothing in this book settles whether Chomsky's concepts are actually encoded in LLMs or not. For all we know, Chomsky's concepts of "binding principles", "binary branching", etc. could be represented in the inner layers of these many-billion-parameter models. In fact, I'd argue that this is the right research to do: prove that no transformer or feed-forward layer inside the neural net encodes, say, "binding principles".
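The kind of research the comment calls for is usually done with probing classifiers: fit a linear model on a layer's hidden states and see whether it can recover the concept. A minimal sketch with synthetic activations (the "concept direction" is planted by hand here, a stand-in for something like a binding-principle feature; nothing in this toy comes from a real model):

```python
import numpy as np

# Toy linear probe: test whether synthetic "hidden states" linearly
# encode a binary property (stand-in for a syntactic concept).
rng = np.random.default_rng(0)

d = 64          # hypothetical hidden-state dimension
n = 500         # number of example activations
direction = rng.normal(size=d)   # planted "concept" direction

labels = rng.integers(0, 2, size=n)  # 0/1 property per example
# Activations = noise + the concept direction, signed by the label.
hidden = rng.normal(size=(n, d)) + np.outer(2 * labels - 1, direction)

# Fit the probe with plain gradient descent on the logistic loss.
w = np.zeros(d)
for _ in range(200):
    p = 1 / (1 + np.exp(-hidden @ w))
    w -= 0.1 * hidden.T @ (p - labels) / n

acc = ((hidden @ w > 0) == labels).mean()
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy on real activations is evidence the concept is (at least linearly) represented; proving a concept is *not* encoded anywhere is much harder, which is the comment's point.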

michaelhartm | 2 years ago

Btw, semantics and syntax are separated in LLMs (the author is wrong). The embedding function (a matmul) can capture syntax, while proximity in the embedding space (e.g. cosine similarity) gives the semantics (that's what attention computes over). So I'm not convinced. Chomsky might be wrong or right, but this author hasn't proven it.
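The proximity claim can be illustrated with cosine similarity over embedding vectors. The vectors below are made up for the sketch, not real LLM embeddings:

```python
import numpy as np

# Hypothetical 3-d word embeddings (invented values for illustration).
emb = {
    "cat":  np.array([0.9, 0.1, 0.0]),
    "dog":  np.array([0.8, 0.2, 0.1]),
    "bond": np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: dot product of the normalized vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["cat"], emb["dog"]))   # high: semantically close
print(cosine(emb["cat"], emb["bond"]))  # low: semantically distant
```

Whether this geometric notion of "semantics" is separable from syntax inside a trained transformer is exactly the empirical question the thread is arguing about.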