They probably don’t. They’re very different. LLMs seem to be based on pragmatic, mathematical techniques developed over time to produce patterns from data.
There are at least three fields in this:
1. Machine learning using non-neurological techniques (most stuff). These use a combination of statistical algorithms stitched together with hyperparameter tweaking. They also usually rely on global optimization by heavy methods like backpropagation.
2. “Brain-inspired” or “biologically accurate” algorithms that try to imitate the brain. They sometimes include evidence that their behavior matches experimental observations of brain behavior. Many of these use complex neurons, spiking nets, and/or local learning (Hebbian).
(Note: There is some work on hybrids such as integrating hippocampus-like memory or doing limited backpropagation on Hebbian-like architectures.)
3. Computational neuroscience, which aims to make biologically accurate models at various levels of granularity. The goal is to understand brain function. A common motivation is diagnosing and treating neurological disorders.
Making an LLM like the brain would require use of brain-inspired components, multiple systems specialized for certain tasks, memory integrated into all of them, and a brain-like model for reinforcement. Imitating God’s complex design is simply much more difficult than combining proven algorithms that work well enough. ;)
That said, I keep collecting work on both efficient ML and brain-inspired ML. I think some combination of the techniques might have high impact later. I think the lower training costs of some brain-inspired methods, especially Hebbian learning, justify more experimentation by small teams with small GPU budgets. We might find something cost-effective in that research. We need more of it on common platforms, too, like Hugging Face libraries and cheap VMs.
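For anyone who hasn't seen what "local learning (Hebbian)" looks like next to backpropagation, here is a minimal sketch. It assumes a single linear neuron trained with Oja's rule, a well-known stabilized variant of Hebbian learning. The names `oja_step` and `train_demo` and the toy data are my own illustration, not taken from any particular paper or library. The point is that each weight update uses only the neuron's own input and output, with no global error signal passed backward.

```python
import random

def oja_step(weights, x, lr=0.01):
    """One local, Hebbian-style update (Oja's rule) for a single linear neuron."""
    # Neuron output: depends only on this neuron's own inputs and weights.
    y = sum(w * xi for w, xi in zip(weights, x))
    # The Hebbian term lr*y*x strengthens co-active connections; the
    # -lr*y*y*w term is Oja's decay, which keeps the weights bounded.
    return [w + lr * y * (xi - y * w) for w, xi in zip(weights, x)]

def train_demo(steps=5000, seed=0):
    """Train on toy 2-D inputs whose variance lies mostly along the first axis."""
    rng = random.Random(seed)
    weights = [0.5, 0.5]  # arbitrary starting point (assumption)
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.1)]
        weights = oja_step(weights, x)
    return weights

if __name__ == "__main__":
    # The weight vector should end up roughly unit length and aligned
    # with the high-variance input axis.
    print(train_demo())
```

Each step costs about as much as a forward pass, which is part of why I think this family of methods is worth trying on small budgets. Whether it scales to anything LLM-like is an open question.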
nickpsecurity|1 year ago
trhway|1 year ago
For the lower level - word embeddings (word2vec, "King – Man + Woman = Queen") - one can see a similarity
https://www.nature.com/articles/d41586-019-00069-1 and https://gallantlab.org/viewer-huth-2016/
"The map reveals how language is spread throughout the cortex and across both hemispheres, showing groups of words clustered together by meaning."
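To make the "King – Man + Woman = Queen" arithmetic concrete, here is a rough sketch. It assumes hand-made 3-D toy vectors, not real word2vec embeddings, and the axis meanings, the `TOY_VECTORS` table, and the `analogy` helper are all made up for illustration. It does the vector arithmetic and then picks the nearest remaining word by cosine similarity, excluding the query words as is commonly done.

```python
import math

# Hand-made toy vectors, NOT real word2vec embeddings.
# Hypothetical axes: [royalty, maleness, femaleness]
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

if __name__ == "__main__":
    print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```

In real embeddings the axes are not interpretable like this and the vectors have hundreds of dimensions, but the lookup works the same way.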
nyrikki|1 year ago
Very different from a feed-forward network with perceptrons, autograd, etc.
Inner product spaces are fixed points; mapping between models is less surprising because the general case is a merger set, IIRC.