This article seems to fall straight into the trap it aims to warn us about. All this talk about "true" understanding, embodiment, etc. is needless anthropomorphizing.
A much better framework for thinking about intelligence is simply as the ability to make predictions about the world (including conditional ones like "what will happen if we take this action"). Whether it's achieved through "true understanding" (however you define it; I personally doubt you can) or "mimicking" has no bearing on most of the questions about the impact of AI we are trying to answer.
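To make that concrete, here is a rough sketch of the "intelligence as prediction" framing in code. Everything in it (the names, the exact-match scoring) is an illustrative assumption, not a description of any real system:

    from typing import Callable, Sequence

    # A "predictor" maps a state of the world plus a candidate action to a predicted
    # next state. Under this framing, intelligence is just how well such predictions
    # hold up against what actually happens.
    Predictor = Callable[[dict, str], dict]

    def prediction_score(model: Predictor,
                         episodes: Sequence[tuple[dict, str, dict]]) -> float:
        # Fraction of (state, action, observed_next_state) triples the model predicts
        # exactly. Exact match is a crude stand-in for any proper scoring rule.
        hits = sum(1 for state, action, nxt in episodes if model(state, action) == nxt)
        return hits / len(episodes) if episodes else 0.0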
It matters if your civilizational system is built on assigning rights or responsibilities to things because they have consciousness or "interiority." Intelligence fits here just as well.
Currently many of our legal systems are set up this way, if in a fairly arbitrary fashion. Consider, for example, how sentience is used as a metric for whether an animal ought to receive additional rights. Or how murder (which requires deliberate, conscious thought) is punished more harshly than manslaughter (which can be accidental or careless).
If we just treat intelligence as a descriptive quality and apply it to LLMs, we quickly realize the absurdity of saying a chatbot is somehow equivalent, consciously, to a human being. At least, to me it seems absurd. And it indicates the flaws of grafting human consciousness onto machines without analyzing why.
"Making predictions about the world" is a reductive and childish way to describe intelligence in humans. Did David Lynch make Mulholland Drive because he predicted it would be a good movie?
The most depressing thing about AI summers is watching tech people cynically try to define intelligence downwards to excuse failures in current AI.
I think that intelligence requires, or rather, is the development and use of a model of the problem while the problem is being solved, i.e. it involves understanding the problem. Accurate predictions, based on extrapolations made by systems trained using huge quantities of data, are not enough.
Imagine an LLM is conscious (as Anthropic wants us to believe). Imagine it is made to train on far more data than its parameter count allows for. Am I hurting the LLM by causing it intense cognitive strain?
I've always had the feeling that AI researchers want to build their own human without having to change diapers as part of the process. Just skip to adulthood, please, and learn to drive a car without ever having bumped into things and hurt yourself.
> Language doesn't just describe reality; it creates it.
I wonder if this is a statement from the discussed paper or from the blog author. Haven't found the original paper yet, but this blog post very much makes me want to read it.
Melanie Mitchell (2021) "Why AI is Harder Than We Think." https://arxiv.org/abs/2104.12871
That sentence is not from this paper.
> I've always had the feeling that AI researchers want to build their own human without having to change diapers as part of the process. Just skip to adulthood, please, and learn to drive a car without ever having bumped into things and hurt yourself.
I partially agree, but the idea with AI is that you need to bump into things and hurt yourself only once. Then you have a good driver you can replicate at will.
> But that still leaves a crucial question: can we develop a more precise, less anthropomorphic vocabulary to describe AI capabilities? Or is our human-centric language the only tool we have to reason about these new forms of intelligence, with all the baggage that entails?
I don't get the problem with this, really. I think "reasoning" is a very fair and proper name for what LLMs do. The model takes time and spits out tokens that it recursively uses to get a much better output than it otherwise would have. Is it actually reasoning with a brain the way a human would? No. But it is close enough that I don't see the problem with calling it "reasoning". What's the fuss about?
Are swimming and sailing the same, because they both have the result of moving through the water?
I'd say, no, they aren't, and there is value in understanding the different processes (and labeling them as such), even if they have outputs that look similar/identical.
It has absolutely nothing to do with reasoning, and I don't understand how anyone could think it's "close enough".
Reasoning models are simply answering the same question twice with a different system prompt. It's a normal LLM with an extra technical step. Nothing else.
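For what it's worth, the two-pass pattern being described can be sketched in a few lines. The `chat()` helper below is a made-up stand-in for any LLM chat API, and the sketch only illustrates the characterization above; it is not a claim about how any vendor actually builds or trains its reasoning models:

    def chat(system: str, user: str) -> str:
        # Hypothetical stand-in for a single call to some LLM chat API.
        raise NotImplementedError

    def answer_with_scratchpad(question: str) -> str:
        # Pass 1: a reasoning-oriented system prompt asks the model to think out loud.
        scratchpad = chat(
            system="Think step by step and write out your reasoning.",
            user=question,
        )
        # Pass 2: the same question again, with a plain system prompt and the
        # scratchpad fed back in as extra context.
        return chat(
            system="Answer concisely.",
            user=question + "\n\nDraft reasoning:\n" + scratchpad,
        )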
The problem is that fuzzy language can make debate poor, turning it into an argument about the definitions of words rather than about reality. The answer, I think, is to avoid that and find things you can be clear about. A famous example is the Turing test. Rather than let the debate over whether machines can think get bogged down in endless variations on how people define thinking, Turing looked at whether a machine could be told apart from a human, which is what he discussed in his paper.
I would add a fifth fallacy: assuming what we humans do can be reduced to “intelligence”. We are actually very irrational. Humans are driven strongly by Will, Desire, Love, Faith, and many other irrational traits. Has an LLM ever demonstrated irrational love? Or sexual desire? How can it possibly do what humans do without these?
Yeah, I think that's an important dimension. David Hume said that there was no action without passion, and I think that's a key difference with AIs. They sit there, passive, until we interact with them. They don't want anything; they don't have goals, desires, or motivations. The emotional part of the human psyche does a lot of work - we aren't just calculating sums.
For all its advanced capabilities, the LLM remains a glorified natural language interface. It is exceptionally good at conversational communication and synthesizing existing knowledge, making information more accessible and in some cases, easier to interact with. However, many of the more ambitious applications, such as so-called "agents," are not a sign of nascent intelligence. They are simply sophisticated workflows—complex combinations of Python scripts and chained API calls that leverage the LLM as a sub-routine. These systems are clever, but they are not a leap towards true artificial agency. We must be cautious not to confuse a powerful statistical tool with the dawn of genuine machine consciousness.
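As a rough sketch of what such an "agent" workflow tends to look like in practice (the `llm()` function and the tool registry below are illustrative placeholders, not any particular framework's API):

    import json

    def llm(prompt: str) -> str:
        # Hypothetical stand-in for a single LLM API call.
        raise NotImplementedError

    # Ordinary functions the workflow can call; the LLM itself never executes anything.
    TOOLS = {
        "search": lambda query: "(search results for " + repr(query) + ")",
        "shout": lambda text: text.upper(),
    }

    def run_agent(task: str, max_steps: int = 5) -> str:
        # A so-called "agent": a plain loop that uses the LLM as a sub-routine.
        history = ["Task: " + task]
        for _ in range(max_steps):
            reply = llm("\n".join(history) +
                        '\nReply with JSON: {"tool": ..., "input": ...} or {"answer": ...}')
            step = json.loads(reply)
            if "answer" in step:
                return step["answer"]
            result = TOOLS[step["tool"]](step["input"])
            history.append("Tool " + step["tool"] + " returned: " + result)
        return "Step budget exhausted."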
> The primary counterargument can be framed in terms of Rich Sutton's famous essay, "The Bitter Lesson," which argues that the entire history of AI has taught us that attempts to build in human-like cognitive structures (like embodiment) are always eventually outperformed by general methods that just leverage massive-scale computation
This reminds me of Douglas Hofstadter, of Gödel, Escher, Bach fame. He rejected all of these statistical approaches to creating intelligence and dug deep into the workings of the human mind [1], often in the most eccentric ways possible.
> ... he has bookshelves full of these notebooks. He pulls one down—it’s from the late 1950s. It’s full of speech errors. Ever since he was a teenager, he has captured some 10,000 examples of swapped syllables (“hypodeemic nerdle”), malapropisms (“runs the gambit”), “malaphors” (“easy-go-lucky”), and so on, about half of them committed by Hofstadter himself.
>
> For Hofstadter, they’re clues. “Nobody is a very reliable guide concerning activities in their mind that are, by definition, subconscious,” he once wrote. “This is what makes vast collections of errors so important. In an isolated error, the mechanisms involved yield only slight traces of themselves; however, in a large collection, vast numbers of such slight traces exist, collectively adding up to strong evidence for (and against) particular mechanisms.”
I don't know when, where, or how the next leap in AGI will come, but it seems very likely that it will come through brute-force computation (unfortunately). So much for fifty years of observing Freudian slips.
[1]: https://www.theatlantic.com/magazine/archive/2013/11/the-man...
>...the most important fallacy. It's the deep-seated assumption that intelligence is, like software, a form of pure information processing that can be separated from its body.
I think he gets into a muddle on that one. If something online can provide smarter thinking and better answers to questions than I can, then I figure it's intelligent, and it doesn't matter if it's an LLM, a human, or a disembodied spirit that somehow happens to be online.
He kind of gets there from human minds not being disembodied from their brains, but that's a different thing.
> The primary counterargument can be framed in terms of Rich Sutton's famous essay, "The Bitter Lesson," which argues that the entire history of AI has taught us that attempts to build in human-like cognitive structures (like embodiment) are always eventually outperformed by general methods that just leverage massive-scale computation.
That's not what it says; it says that hand-made heuristics are defeated by general methods. There is no reason why the same methods should not perform even better when informed by data from interacting with the world.
> Mitchell in her paper compares modern AI to alchemy. It produces dazzling, impressive results but it often lacks a deep, foundational theory of intelligence.
> It’s a powerful metaphor, but I think a more pragmatic conclusion is slightly different. The challenge isn't to abandon our powerful alchemy in search of a pure science of intelligence.
But alchemy was wrong, and chasing the illusions created by the frauds who promoted it held back the advancement of science for a long time.
We absolutely should have abandoned alchemy as soon as we saw that it didn't work, and moved to figuring out the science of what worked.
I think the Stochastic Parrots idea is pretty outdated and incorrect. LLMs are not parrots; we don't even need them to parrot, since we already have perfect copying machines. LLMs are for working on new things; that is their purpose. Reproducing the same things we already have is not worth it.
The core misconception here is that LLMs are autonomous agents parroting away. No, they are connected to humans, tools, reference data, and validation systems. They are in a dialogue, and in a dialogue you quickly get to a place where nobody has ever been before. Take any 10 consecutive words from a human or an LLM, and chances are nobody on the internet has strung those words together the same way before.
LLMs are more like pianos than parrots, or better yet, like another musician jamming together with you, creating something together that neither would create individually. We play our prompts on the keyboard and they play their "music" back to us. Good or bad depends on the player at the keyboard, who retains most of the control. To say LLMs are Stochastic Parrots is to discount the contribution of the human using them.
Related to intelligence, I think we have a misconception that it comes from the brain. No, it comes from the feedback loop between brain and environment. The environment plays a huge role in exploration, learning, testing ideas, and discovery. The social aspect also plays a big role, parallelizing exploration and streamlining the exploitation of discoveries. We are not individually intelligent; it is a social, environment-based process, not a pure-brain process.
Searching for intelligence in the brain is like searching for art in the paint pigments and canvas cloth.
I think you are on to something. Chasing AGI is, I believe, ultimately a useless endeavour, but we can already use the existing tools we have in ingenious and creative ways. And no, I don't mean the endless barrage of AI lo-fi hip hop or the same "cool" album cover with random kanji that all of them have. For instance, it is pretty amazing to have a private tutor with which you can discuss why Charles XII of Sweden ultimately failed in his war against Russia, or why roughly 30% of people seem to have a personality that leans toward authoritarianism - this is how people have learned since the very beginning of language. But conversation is an art, and you get out of it what you bring into it. It also does not give you a ready-made result which you can immediately capitalise on, which is what investors want, but it is what could and can ultimately be useful to humanity.
However, almost all models (the worst being ChatGPT) are made virtually useless in this respect, since they are basically sycophantic yes-men - why on earth does an "autocorrect on steroids" pretend to laugh at my jokes?
The next step is not to build faster models or throw more computing power at them, but to learn to play the piano.
The fact that it can smartly copy exactly ONE piece of information from a given prompt (a complex sentence that only humans could process before) and not the others is absolutely progress in computer science, and very useful. I'm still amazed by it every day; I never thought I'd see an algorithm like that in my lifetime. (Calling it parroting is, of course, pejorative.)
You can shuffle a deck of 52 cards and be reasonably confident that nobody has ever gotten that exact shuffle (or probably ever will, until the universe dies); see the quick arithmetic below. But at least in this case, we are sure that a deck of 52 cards can be arranged in any permutation of 52 cards. We know we can reach any state from any other state.
This is not the case for LLMs. We don't know what the full state space looks like. Just because the state space that LLMs (lossily) compress is unimaginably huge doesn't mean that you can assume the state you want is in it. So yeah, you might get a string of symbols that nobody has seen before, but you still have no way of knowing whether A) it's the string of symbols you wanted, and B) if it isn't, whether the string of symbols you wanted can ever be generated by the network at all.
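(For anyone curious, the quick arithmetic behind the shuffle claim above, in plain Python:)

    import math

    orderings = math.factorial(52)   # distinct orderings of a 52-card deck
    print(f"{orderings:.3e}")        # ~8.066e+67

    # Even a trillion shuffles per second, sustained for the age of the universe
    # (~4.3e17 seconds), covers only ~4.3e29 orderings: a vanishing fraction.
    covered = 10**12 * 430_000_000_000_000_000
    print(covered / orderings)       # ~5e-39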
> Question for the author: how are SOTA LLM models not common sense machines?
Not the author, but to extend this quote from the article:
> Its [Large Language Models] ability to write code and summarize text feels like a qualitative leap in generality that the monkey-and-moon analogy doesn't quite capture. This leaves us with a forward-looking question: How do recent advances in multimodality and agentic AI test the boundaries of this fallacy? Does a model that can see and act begin to bridge the gap toward common sense, or is it just a more sophisticated version of the same narrow intelligence? Are world models a true step towards AGI or just a higher branch in a tree of narrow linguistic intelligence?
I'd put the expression "common sense" on the same level as having causal connections, and I would also assume that SOTA LLMs do not create an understanding based on causality. AFAICS this is known as the "reversal curse" [0].
[0]: https://youtu.be/zjkBMFhNj_g?t=750
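To make concrete what the reversal curse refers to (as I understand it): a model that completes a fact stated in one direction often fails on the same fact reversed. A tiny probe sketch, where `ask` is a hypothetical wrapper around whatever model you want to test:

    def reversal_probe(ask, forward_q: str, forward_a: str,
                       backward_q: str, backward_a: str) -> tuple[bool, bool]:
        # Returns (knows_forward, knows_backward) for one fact asked in both directions.
        # `ask` is a hypothetical callable that sends a question to a model.
        return (forward_a.lower() in ask(forward_q).lower(),
                backward_a.lower() in ask(backward_q).lower())

    # The example usually cited for this:
    #   forward:  "Who is Tom Cruise's mother?"     -> "Mary Lee Pfeiffer"  (usually answered)
    #   backward: "Who is Mary Lee Pfeiffer's son?" -> "Tom Cruise"         (often missed)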
> Someone should let Waymo, Zoox, Pony.ai, Apollo Go, and even Tesla know!
I let them know today, when I laid on my horn while passing a Waymo that was stopped at a green light, blocking the left turn lane, with its right blinker on.
Re: Tesla, this company paid me nearly $250,000 under multiple lemon law claims for “self driving” software issues I identified that affected safety.
We all know what happened with Cruise, which was after I declared myself constructively dismissed.
I think the characterization in the article is fair: “self driving” is not quite there yet.
They know. There's a big difference between being able to navigate the 80% of everyday driving situations and handling the 20% that most people manage just fine but cars struggle with. There's a road in these parts: narrow, twisty in three dimensions, unmarked, trees close to the road. It gets jolly slippery in the winter. I can drive that road in the middle of the night in sleet. Can an autonomous car?
Part of the point of fallacies one and four is that a human can get out of the car and walk into work as a CPA or whatever, while even the autonomous-ish offerings of Waymo et al. don't necessarily advance the ball in other domains.
I don't understand why people have so much trouble with metaphors being used to explain things in AI.
The same terms exist in other fields. Physics has things that want to go to a lower energy level; the ball wants to fall but the table is holding it up. Electrons don't like being near each other. The Higgs boson puts on little bunny ears and goes around giving mass to all the other good particles.
None of these are said in any way as a suggestion that these things have any form of intention.
They also don't in AI. When scientists really think those abilities are there in a provable way (or even if they suspect it), I can assure you that they will be prepared to make it crystal clear that this is what they are claiming. Criticising the use of metaphor is kind of a pre-emptive attack against claims that might be made in the future.
Some AI scientists believe that there is a degree of awareness in recent models. They may be right or wrong but the ones who believe this are outright saying so.
I'm also inclined, if you'll excuse the term, to be critical of anything suggesting the assumption of smooth progress when they declare something to be the first step. Steps are not smooth. That's a good example of ignoring the what of the metaphor.
I don't really know what to make of the embodiment position; it feels like it's trying to hide dualism behind a practical limitation. Once you start drilling down into the why/why not and what you mean by that, I wouldn't be at all surprised to see the expectation that you can't train an AI because it doesn't have a soul.
I agree with xkcd 1425 though.
A lot of people really, really don't want LLMs to be "actually intelligent", so they oppose any use of any remotely "anthropomorphic" terms in application to LLMs on that principle alone.
IMO, anthropomorphizing LLMs is at least directionally correct in 9 cases out of 10.
It’s true that much of the debate around AI swings between extremes — utopian promises on one side, dystopian collapse on the other. But institutions don’t operate well in extremes.
What matters is how we design governance that acknowledges uncertainty while still enabling progress. In practice, that means imperfect but adaptive frameworks — guardrails that evolve as technology and society evolve.
Instead of asking “which fallacy is right,” we might ask: how do we build systems that remain trustworthy even when our assumptions about AI turn out to be wrong?
The only real and measurable thing is performance. And the performance of AI systems only goes up.
> Language doesn't just describe reality; it creates it.
I never understand these kinds of statements. Does the sun not exist until we have a word for it? Did “under the rock” not exist for dinosaurs?
That is also a fallacy, from being too immersed in a professional environment filled with deep reasoning and a deep-rooted tradition of logic.
In the greater human civilization you will find an abundance of individuals lacking both reasoning and common sense.
They don't need to reach human-equal intelligence; they just need to reach an acceptable level of intelligence so corporations can reduce labor costs.
Sure, it's bad at certain things, but you know what? Most real-world jobs don't need a genius either.
I honestly didn't understand the arguments. Could someone give a TL;DR, please?