
ArcaneMoose | 1 year ago

I actually made a video diving deeper into this and comparing responses from people and ChatGPT for a creative thinking problem - https://youtu.be/l-9EUBbktqw


somenameforme | 1 year ago

I think your hypothesis here (and probably the entire article as well) is strongly challenged by the 'progenitor argument.' Take humans at the dawn of humanity. Language did not exist beyond what may have been crude sounds or gestures, and collective knowledge did not extend far beyond 'poke him with the pointy side.' Somehow we went from that to putting a man on the Moon in what was essentially the blink of an eye.

Train an LLM on the entirety of knowledge at that dawn of humanity and, even with literally infinite training time, it's never going to go anywhere. It's going to just keep making relatively simple recombinations of its training set until somebody gives it a new training set to remix. This remix-only nature is no different with modern knowledge; it's simply extremely obfuscated, because there's such a massive base of information and nobody is aware of more than a minuscule fraction of it all.

---

As for the 'secret' of LLMs, I think it's largely that most language is extremely redundant. One thought or point naturally flows.... why do I even complete the rest of this statement? You already know exactly what I'm going to say, right? And from that statement the rest of my argument will also mostly write itself. Yet we do write out the rest, which is kind of weird if you think about it. Anyhow, the point is that by looking at language 'flow correlations' over huge samples, LLMs can reconstruct and remix arbitrarily long dialogue from even the shortest of initial inputs. And it usually sounds at least reasonable, except when it doesn't and we call it a hallucination, though that's quite a misnomer, because the entire process is a hallucination.
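The remix mechanic is easy to demo with a toy stand-in: treat 'flow correlations' as plain bigram counts over a corpus, then sample forward from a short prompt. (A real LLM learns vastly richer correlations with a transformer, not bigrams; this is just a sketch of the remix idea, and all the names here are made up for illustration.)

```python
import random
from collections import defaultdict

def train_bigrams(corpus):
    """Record which words were observed to follow each word (with repetition,
    so more frequent continuations are sampled more often)."""
    follows = defaultdict(list)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)
    return follows

def generate(follows, start, length=10, seed=0):
    """Remix the training set: repeatedly sample a plausible next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        nxt = follows.get(out[-1])
        if not nxt:
            break  # dead end: this word was never followed by anything
        out.append(rng.choice(nxt))
    return " ".join(out)

corpus = "one thought naturally flows into the next thought and the next point"
model = train_bigrams(corpus)
print(generate(model, "one"))
```

Every output is locally plausible (each adjacent pair occurred in the corpus) yet the whole can be globally nonsensical, which is roughly the hallucination point above.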

ArcaneMoose | 1 year ago

Interesting point - thanks for sharing! I think one big missing piece in today's AIs is the ability to learn on the fly and reconfigure their weights. We are constantly bombarded with input and our neurons adjust accordingly; current LLMs just use a snapshot. I would be really curious to see how online-first AI models could work, focusing on a constant input stream and iterating on the weights as it arrives. Also, I wonder how much knowledge is baked into our DNA through evolution. I have a hunch that this is somewhat analogous to model architectures.
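The online-first idea can be sketched in a few lines: instead of training on a frozen snapshot, nudge the weights a little after every input that streams in. (A single linear neuron with SGD stands in for a full network here; the function name and learning rate are illustrative, not from any particular system.)

```python
def online_step(w, b, x, target, lr=0.1):
    """One streaming update: predict, measure the error, nudge the weights."""
    pred = w * x + b
    err = pred - target
    w -= lr * err * x   # gradient of squared error w.r.t. w
    b -= lr * err       # gradient of squared error w.r.t. b
    return w, b

# A continuous input stream: the model keeps adapting as data arrives,
# rather than being trained once and then frozen. Here the stream
# repeatedly shows examples of the rule target = 2 * x.
w, b = 0.0, 0.0
for x, target in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)] * 200:
    w, b = online_step(w, b, x, target)

print(round(w, 2), round(b, 2))  # drifts toward w = 2, b = 0
```

The interesting (and unsolved) part for real networks is doing this at scale without catastrophic forgetting, which is presumably why current LLMs settle for the snapshot approach.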

Btw, although I see evidence of LLMs creating "new ideas" through combinations of existing ones, I am a bit mystified by their apparent reasoning issues. I wonder how that differs in nature from the memory-based approach. The ARC-AGI benchmark has had me thinking about this for sure.