top | item 42558374


FroshKiller | 1 year ago

Most foxes are herbivores according to the fellow guarding my henhouse.


redlock|1 year ago

They do make compelling arguments if you listen to them. Also, from my experience coding with LLMs in Cursor, they obviously understand the code and my request more often than not. Mechanistic interpretability research shows evidence of concepts being represented within the layers. Golden Gate Claude is evidence of this: https://www.anthropic.com/news/golden-gate-claude

To me this proves that LLMs learn concepts and multilayered representations within their networks, not just some dumb statistical inference. Even a famous LLM skeptic like François Chollet doesn't invoke the stochastic-parrot argument anymore and has moved on to arguing that they don't generalize well and are just memorizing.

With GPT-2 and GPT-3 I was of the same opinion, that they seemed like just sophisticated stochastic parrots, but current LLMs are a different class from the early GPTs. Now that o3 has beaten a memorization-resistant benchmark like ARC-AGI, I think we can confidently move on from the stochastic-parrot notion.

(And before you argue that o3 is not an LLM, here is an OpenAI researcher stating that it is one: https://x.com/__nmca__/status/1870170101091008860?s=46&t=eTe... )

pixelfarmer|1 year ago

Neural networks generalize as part of their optimization scheme. The current approach is just to stack many-layered neural networks (at the core), as in "deep learning," to solve problems, but the networks are too regular, too "primitive" of sorts. What is needed are network topologies that strongly support this generalization, the creation of "meta abstraction levels"; otherwise it will get nowhere.

Biological networks of the more intelligent species contain a few billion neurons and upwards from there, while even the big LLMs are somewhere in the millions of "equivalent neurons" at best. So: bad topology plus far fewer "neurons," and the resulting capabilities shouldn't be too surprising. Plus it is clear that AGI is nowhere close, because one result of AGI is a proper understanding of "I". Crows have an understanding of "I", for example.
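As a rough back-of-envelope illustration of the scale comparison above (all figures are approximate public estimates, and treating one parameter as roughly one synapse is a loose analogy I'm assuming, not an established equivalence):

```python
# Back-of-envelope comparison of LLM scale to biological scale.
# Assumptions (not from the comment): a ~1-trillion-parameter frontier
# model, and ~10^4 synapses per neuron as a typical cortical estimate.

llm_params = 1e12            # assumed parameter count of a large model
synapses_per_neuron = 1e4    # rough estimate for mammalian cortex

# "Equivalent neurons" under the parameter-per-synapse analogy
equivalent_neurons = llm_params / synapses_per_neuron   # 1e8, i.e. ~100 million

human_neurons = 8.6e10       # ~86 billion neurons in a human brain
ratio = human_neurons / equivalent_neurons

print(f"equivalent neurons: {equivalent_neurons:.0e}")
print(f"brain is ~{ratio:.0f}x larger by this measure")
```

Under these (very rough) assumptions, even a trillion-parameter model lands around 10^8 "equivalent neurons," a few hundred times smaller than a human brain by this crude measure, which is the ballpark the comment is gesturing at.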

And that is where these "meta abstraction levels" come in: many of them are needed to eventually reach the stage of "I". This can also be used to test how well neural networks perform: how far they can really abstract things, how many levels of generalization they handle. But therein lies a problem: let two people specify abstraction levels and the results will be all over the board. This is also why ARC-AGI, while it dives into that, cannot really solve the problem of testing "AI", let alone "AGI": we as humans are currently unable to properly test intelligence in any meaningful way. All the tests we have are mere glimpses into it, often complex, multivariable, multi-(abstraction-)layered tests, and dealing with the results is consequently a total mess, even if we throw big formulas at it.