bloaf | 2 months ago

Everyone is out here acting like "predicting the next thing" is somehow fundamentally irrelevant to "human thinking" and it is simply not the case.

What does it mean to say that we humans act with intent? It means that we have some expectation or prediction about how our actions will affect the next thing, and choose our actions based on how much we like that effect. The ability to predict is fundamental to our ability to act intentionally.

So in my mind: even if you grant all the AI-naysayers' complaints about how LLMs aren't "actually" thinking, you can still believe that they will end up being a component in a system which actually "does" think.

RayVR|2 months ago

Are you a stream of words or are your words the “simplistic” projection of your abstract thoughts? I don’t at all discount the importance of language in so many things, but the question that matters is whether statistical models of language can ever “learn” abstract thought, or become part of a system which uses them as a tool.

My personal assessment is that LLMs can do neither.

ACCount37|2 months ago

Words are the "simplistic" projection of an LLM's abstract thoughts.

An LLM has: words in its input plane, words in its output plane, and A LOT of cross-linked internals between the two.

Those internals aren't "words" at all - and it's where most of the "action" happens. It's how LLMs can do things like translate from language to language, or recall knowledge they only encountered in English in the training data while speaking German.

balamatom|2 months ago

I'm definitely a stream of words.

My "abstract thoughts" are a stream of words too, they just don't get sounded out.

Tbf I'd rather they weren't there in the first place.

But bodies which refuse to harbor an "interiority" are fast-tracked to destruction because they can't suf^W^W^W be productive.

Funny movie scene from somewhere. The sergeant is drilling the troops: "You, private! What do you live for!", and expects an answer along the lines of dying for one's nation or some shit. Instead, the soldier replies: "Well, to see what happens next!"

Davidzheng|2 months ago

Even if they are "simplistic projections", which I don't think is the correct way to think about it, there's no reason that more LLM thoughts in middle layers can't also exist and project down at the end. Though there might be efficiency issues because the latent thoughts have to be recomputed a lot.

Though I do think in human brains it's also an interplay where what we write/say also loops back into the thinking as well. Which is something which is efficient for LLMs.

gardenhedge|2 months ago

I am a stream of words - I have even run out of tokens while speaking before :)

But raising kids, I can clearly see that intelligence isn't just solved by LLMs

akoboldfrying|2 months ago

LLMs and human brains are both just mechanisms. Why would one mechanism a priori be capable of "learning abstract thought", but no others?

If it turns out that LLMs don't model human brains well enough to qualify as "learning abstract thought" the way humans do, some future technology will do so. Human brains aren't magic, special or different.

MyOutfitIsVague|2 months ago

> Everyone is out here acting like "predicting the next thing" is somehow fundamentally irrelevant to "human thinking" and it is simply not the case.

Nobody is. What people are doing is claiming that "predicting the next thing" does not define the entirety of human thinking, and something that is ONLY predicting the next thing is not, fundamentally, thinking.

visarga|2 months ago

Well, yes because thinking soon requires interacting, not just ideating. It's in the dialogue between ideation and interaction that we make our discoveries.

agumonkey|2 months ago

when LLM popped out and people started to say 'this is just markov chain on steroid and not thinking' i was a bit confused because a lot of my "thinking" is statistical too.. I very often try to solve an issue by switching a known solution with a different "probable" variant of it (tweaking a parameter)

LLMs have higher dimensions (they map token to grammatical and semantical space) .. it might not be thinking but it seems on its way we're just thinking with more abstractions before producing speech ?... dunno
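
For reference, here is a minimal sketch of the kind of word-level Markov chain that comparison points at. The corpus and the function names are made up for illustration. The only "state" is the current word, and the next word is drawn from a table of words that followed it in the training text; there is no mapping of tokens into any grammatical or semantic space.

```python
import random
from collections import defaultdict


def build_bigram_table(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def generate(table, start, length, seed=0):
    """Sample up to `length` words; each choice depends only on the last word."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        followers = table.get(word)
        if not followers:  # dead end: this word never had a successor
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the dog sat on the rug"
table = build_bigram_table(corpus)
print(sorted(set(table["the"])))  # ['cat', 'dog', 'mat', 'rug']
print(generate(table, "the", 6))
```

Everything such a chain can say is a recombination of adjacent word pairs it has already seen, which is the gap the "higher dimensions" remark is gesturing at.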

akoboldfrying|2 months ago

I claim that all of thinking can be reduced to predicting the next thing. Predicting the next thing = thinking in the same way that reading and writing strings of bytes is a universal interface, or every computation can be done by a Turing machine.

Libidinalecon|2 months ago

A motorcycle is not "sprinting" and an LLM is not "thinking". Everyone would agree that a motorcycle is not running but the same dumb shit is posted over and over and over on here that somehow the LLM is "thinking".

crazygringo|2 months ago

But your assertion is merely semantic. It doesn't say anything substantive.

I could also say a motorcycle "moves forward" just like a person "moves forward". Whether we use the same or different words for same or different concepts doesn't say anything about the actual underlying similarity.

And please don't call stuff "dumb shit" here. Not appropriate for HN.

Extasia785|2 months ago

A forklift is "lifting" things, despite using a completely different mechanical process from a human "lifting" things. The only real similarity between these kinds of "lifting" is the end result: something is higher up than it was before.

DoctorOetker|2 months ago

is this seriously about continuous rotation versus a pair of double pendulums making a stepping motion?

MattRix|2 months ago

That’s because the motorcycle thing is too simplistic of a comparison. It doesn’t come close to capturing the nuance of the whole LLM “thinking” situation.

efitz|2 months ago

AI has made me question what it is to be a human.

I am not having some existential crisis, but if we get to a point where X% of humans cannot outperform “AI” on any task that humans deem “useful”, for some nontrivial value of X, then many assumptions that culture has inculcated into me about humanity are no longer valid.

What is the role of humans then?

Can it be said that humans “think” if they can’t think a thought that a non thinking AI cannot also think?

tjr|2 months ago

If all humans were suddenly wiped off the face of the earth, AI would go silent, and the hardware it runs on would eventually shut down.

If all AI was suddenly wiped off the face of the earth, humans would rebuild it, and would carry on fine in the meantime.

One AI researcher decades ago said something to the effect of: researchers in biology look at living organisms and wonder how they live; researchers in physics look at the cosmos and wonder what all is out there; researchers in artificial intelligence look at computer systems and wonder how they can be made to wonder such things.

throw4847285|2 months ago

But the "AI" is simply a gestalt of all human language. You're looking in a mirror.

zkmon|2 months ago

It may be doing the "thinking" and could reach AGI. But we don't want it. We don't want to take a fork lift to the gym. We don't want plastic aliens showing off their AGI and asking humanity to outsource human thinking and decision-making to them.

perrygeo|2 months ago

Predicting the next token is not at all the same thing as predicting the next action in a causal chain of actions. Not even close. One is a model of language tokens, the other is a model of the physical world. You can come up with all sorts of predictions that can't be expressed cleanly in natural language. And plenty of things that parse cleanly from a language perspective but are unhinged in their description of empirical reality.

voidhorse|2 months ago

When you have a thought, are you "predicting the next thing"—can you confidently classify all mental activity that you experience as "predicting the next thing"?

Language and society constrain the way we use words, but when you speak, are you "predicting"? Science allows human beings to predict various outcomes with varying degrees of success, but much of our experience of the world does not entail predicting things.

How confident are you that the abstractions "search" and "thinking" as applied to the neurological biological machine called the human brain, nervous system, and sensorium and the machine called an LLM are really equatable? On what do you base your confidence in their equivalence?

Does an equivalence of observable behavior imply an ontological equivalence? How does Heisenberg's famous principle complicate this when we consider the role observers play in founding their own observations? How much of your confidence is based on biased notions rather than direct evidence?

The critics are right to raise these arguments. Companies with a tremendous amount of power are claiming these tools do more than they are actually capable of and they actively mislead consumers in this manner.

ctoth|2 months ago

> When you have a thought, are you "predicting the next thing"

Yes. This is the core claim of the Free Energy Principle[0], from the most-cited neuroscientist alive. Predictive processing isn't AI hype - it's the dominant theoretical framework in computational neuroscience for ~15 years now.

> much of our experience of the world does not entail predicting things

Introspection isn't evidence about computational architecture. You don't experience your V1 doing edge detection either.

> How confident are you that the abstractions "search" and "thinking"... are really equatable?

This isn't about confidence, it's about whether you're engaging with the actual literature. Active inference[1] argues cognition IS prediction and action in service of minimizing surprise. Disagree if you want, but you're disagreeing with Friston, not OpenAI marketing.

> How does Heisenberg's famous principle complicate this

It doesn't. Quantum uncertainty at subatomic scales has no demonstrated relevance to cognitive architecture. This is vibes.

> Companies... are claiming these tools do more than they are actually capable of

Possibly true! But "is cognition fundamentally predictive" is a question about brains, not LLMs. You've accidentally dismissed mainstream neuroscience while trying to critique AI hype.

[0] https://www.nature.com/articles/nrn2787

[1] https://mitpress.mit.edu/9780262045353/active-inference/

Ukv|2 months ago

> can you confidently classify all mental activity that you experience as "predicting the next thing"? [...] On what do you base your confidence in their equivalence?

To my understanding, bloaf's claim was only that the ability to predict seems a requirement of acting intentionally and thus that LLMs may "end up being a component in a system which actually does think" - not necessarily that all thought is prediction or that an LLM would be the entire system.

I'd personally go further and claim that correctly generating the next token is already a sufficiently general task to embed pretty much any intellectual capability. To complete `2360 + 8352 * 4 = ` for unseen problems is to be capable of arithmetic, for instance.
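
A minimal sketch of that point, assuming a hypothetical `complete` function standing in for the model (it is not any real LLM API). To return the right continuation for prompts it has never seen, the completer cannot be a lookup table; it has to implement operator precedence and integer arithmetic somewhere inside.

```python
import ast
import operator

# Operators this toy completer understands.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _evaluate(node):
    """Evaluate a parsed arithmetic expression made of ints and + - *."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def complete(prompt: str) -> str:
    """Hypothetical 'model': return the text that should follow the prompt."""
    expression = prompt.rstrip().rstrip("=").strip()
    tree = ast.parse(expression, mode="eval")
    return str(_evaluate(tree.body))


print(complete("2360 + 8352 * 4 = "))  # 35768
```

Here the capability is hand-written; the claim in the comment is that a model trained to predict such continuations has to end up with something functionally equivalent.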

Yhippa|2 months ago

Boo LLM-generated comments!

bloaf|2 months ago

> When you have a thought, are you "predicting the next thing"—can you confidently classify all mental activity that you experience as "predicting the next thing"?

So notice that my original claim was "prediction is fundamental to our ability to act with intent" and now your demand is to prove that "prediction is fundamental to all mental activity."

That's a subtle but dishonest rhetorical shift to make me have to defend a much broader claim, which I have no desire to do.

> Language and society constrains the way we use words, but when you speak, are you "predicting"?

Yes, and necessarily so. One of the main objections that dualists use to argue that our mental processes must be immaterial is this [0]:

* If our mental processes are physical, then there cannot be an ultimate metaphysical truth-of-the-matter about the meaning of those processes.

* If there is no ultimate metaphysical truth-of-the-matter about what those processes mean, then everything they do and produce are similarly devoid of meaning.

* Asserting a non-dualist mind therefore implies your words are meaningless, a self-defeating assertion.

The simple answer to this dualist argument is precisely captured by this concept of prediction. There is no need to assert some kind of underlying magical meaning to be able to communicate. Instead, we need only say that in the relevant circumstances, our minds are capable of predicting what impact words will have on the receiver and choosing them accordingly. Since we humans don't have access to each other's minds, we must not learn these impacts from some kind of psychic mind-to-mind sense, but simply from observing the impacts of the words we choose on other parties; something that LLMs are currently (at least somewhat) capable of observing.

[0] https://www.newdualism.org/papers/E.Feser/Feser-acpq_2013.pd...

If you read the above link you will see that they spell out 3 problems with our understanding of thought:

Consciousness, intentionality, and rationality.

Of these, I believe prediction is only necessary for intentionality, but it does have some roles to play in consciousness and rationality.

micromacrofoot|2 months ago

Yes, personally I'm completely fine with the fact that LLMs don't actually think. I don't care that they're not AGI, though the hysterics about "AGI is so close now" seem silly to me. Fusion reactors and self-driving cars are just around the same corner.

They prove to have some useful utility to me regardless.

gamerDude|2 months ago

I'm an "LLMs are being used in workflows they don't make sense in"-sayer. And while yes, I can believe that LLMs can be part of a system that actually does think, I believe that to achieve true "thinking", it would likely be a system that is more deterministic in its approach rather than probabilistic.

Especially when modeling acting with intent. The ability to measure against past results and think of new innovative approaches seems like it may come from a system that may model first and then use LLM output. Basically something that has a foundation of tools rather than an LLM using MCP. Perhaps using LLMs to generate a response that humans like to read, but not in them coming up with the answer.

Either way, yes, it's possible for a thinking system to use LLMs (and potentially humans piece together sentences in a similar way), but it's also possible LLMs will be cast aside and a new approach will be used to create an AGI.

So for me: even if you are an AI-yeasayer, you can still believe that they won't be a component in an AGI.

visarga|2 months ago

You can make a separate model for the task, which is based on well chosen features and calibrated from actual data. Then the LLM only needs to generate the arguments to this model (extract those features from messages) and call it like a MCP tool. This external tool can be a simple Sklearn model.
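
A rough sketch of that division of labor, under loud assumptions: the task (spam scoring), the feature names, the weights, and the JSON tool-call format are all invented for illustration, and the fixed weights stand in for a model that would really be fitted to data (for example with Sklearn). Plain Python is used so the sketch has no dependencies. The LLM's only job is to extract the features and emit the call; the number comes from the separate model.

```python
import json
import math

# Hypothetical calibrated model: fixed logistic-regression weights standing in
# for an estimator that would be fitted on actual data.
WEIGHTS = {"num_links": 0.8, "has_urgent_words": 1.5}
BIAS = -2.0


def spam_probability(num_links: int, has_urgent_words: bool) -> float:
    """Score a message from two hand-chosen features."""
    z = (BIAS
         + WEIGHTS["num_links"] * num_links
         + WEIGHTS["has_urgent_words"] * float(has_urgent_words))
    return 1.0 / (1.0 + math.exp(-z))


# Registry of tools the LLM is allowed to call.
TOOLS = {"spam_probability": spam_probability}


def call_tool(llm_output: str) -> float:
    """Parse the LLM's structured request and dispatch it to the tool."""
    request = json.loads(llm_output)
    return TOOLS[request["tool"]](**request["arguments"])


# Pretend the LLM read a message and produced this request (assumed format).
llm_output = (
    '{"tool": "spam_probability", '
    '"arguments": {"num_links": 3, "has_urgent_words": true}}'
)
print(round(call_tool(llm_output), 3))
```

The answer is then as reproducible and as well calibrated as the external model, regardless of how the LLM phrases things around it.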

jampekka|2 months ago

A good heuristic is that if an argument resorts to "actually not doing <something complex sounding>" or "just doing <something simple sounding>" etc, it is not a rigorous argument.

bamboozled|2 months ago

The issue is that prediction is "part" of the human thought process, it's not the full story...

bloaf|2 months ago

And the big players have built a bunch of workflows which embed many other elements besides just "predictions" into their AI products: things like web search, incorporating feedback from code testing, and feeding outputs back into future iterations. Who is to say that one or more of these additions hasn't pushed the ensemble across the threshold and into "real actual thinking"?

The near-religious fervor with which people insist that "it's just prediction" makes me want to respond with some religious allusions of my own:

> Who is this that wrappeth up sentences in unskillful words? Gird up thy loins like a man: I will ask thee, and answer thou me. Where wast thou when I laid up the foundations of the earth? tell me if thou hast understanding. Who hath laid the measures thereof, if thou knowest? or who hath stretched the line upon it?

The point is that (as far as I know) we simply don't know the necessary or sufficient conditions for "thinking" in the first place, let alone "human thinking." Eventually we will most likely arrive at a scientific consensus, but as of right now we don't have the terms nailed down well enough to claim the kind of certainty I see from AI-detractors.

throwaway150|2 months ago

> The issue is that prediction is "part" of the human thought process, it's not the full story...

Do you have a proof for this?

Surely such a profound claim about human thought process must have a solid proof somewhere? Otherwise who's to say all of human thought process is not just a derivative of "predicting the next thing"?

observationist|2 months ago

It's fascinating when you look at each technical component of cognition in human brains and contrast against LLMs. In humans, we have all sorts of parallel asynchronous processes running, with prediction of columnar activations seemingly the fundamental local function, with tens of thousands of mini columns and regions in the brain corresponding to millions of networked neurons using the "predict which column fires next" objective to increment or decrement the relative contribution of any functional unit.

In the case of LLMs you run into similarities, but they're much more monolithic networks, so the aggregate activations are going to scan across billions of neurons each pass. The sub-networks you can select each pass by looking at a threshold of activations resemble the diverse set of semantic clusters in bio brains - there's a convergent mechanism in how LLMs structure their model of the world and how brains model the world.

This shouldn't be surprising - transformer networks are designed to learn the complex representations of the underlying causes that bring about things like human generated text, audio, and video.

If you modeled a star with a large transformer model, you would end up with semantic structures and representations that correlate to complex dynamic systems within the star. If you model slug cellular growth, you'll get structure and semantics corresponding to slug DNA. Transformers aren't the end-all solution - the paradigm is missing a level of abstraction that fully generalizes across all domains, but it's a really good way to elicit complex functions from sophisticated systems, and by contrasting the way in which those models fail against the way natural systems operate, we'll find better, more general methods and architectures, until we cross the threshold of fully general algorithms.

Biological brains are a computational substrate - we exist as brains in bone vats, connected to a wonderfully complex and sophisticated sensor suite and mobility platform that feeds electrically activated sensory streams into our brains, which get processed into a synthetic construct we experience as reality.

Part of the underlying basic functioning of our brains is each individual column performing the task of predicting which of any of the columns it's connected to will fire next. The better a column is at predicting, the better the brain gets at understanding the world, and biological brains are recursively granular across arbitrary degrees of abstraction.

LLMs aren't inherently incapable of fully emulating human cognition, but the differences they exhibit are expensive. It's going to be far more efficient to modify the architecture, and this may diverge enough that whatever the solution ends up being, it won't reasonably be called an LLM. Or it might not, and there's some clever tweak to things that will push LLMs over the threshold.

moralIsYouLie|2 months ago

most humans in any percentile act towards the thing of someone else. most of these things are a lot worse than what the human "would originally intend". this behavior stems from 100s and thousands of nudges since childhood.

the issue with AI and AI-naysayers is, by analogy, this: cars were built to drive from A to Z. people picked up tastes and some people started building really cool looking cars. the same happens on the engineering side. then portfolio communists came with their fake capitalism and now cars are built to drive over people but don't really work because people, thankfully, are overwhelmingly still fighting to attempt to act towards their own intents.

Nevermark|2 months ago

Exactly. Our base learning is by example, which is very much learning to predict.

Predict the right words, predict the answer, predict when the ball bounces, etc. Then reversing predictions that we have learned. I.e. choosing the action with the highest prediction of the outcome we want. Whether that is one step, or a series of predicted best steps.
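
A toy sketch of "reversing" a prediction, with invented numbers: `predict_success` is a hypothetical stand-in for a learned predictor, and acting is just picking whichever candidate the predictor scores highest. The second function repeats that choice over a series of steps.

```python
def predict_success(angle_degrees: float) -> float:
    """Hypothetical learned predictor: chance a throw at this angle scores."""
    best_angle = 50.0
    return max(0.0, 1.0 - abs(angle_degrees - best_angle) / 40.0)


def choose_action(actions, predict):
    """One step: pick the action with the highest predicted outcome."""
    return max(actions, key=predict)


def greedy_plan(position, target, moves, horizon):
    """A series of steps: at each one, take the move predicted to end closest."""
    path = []
    for _ in range(horizon):
        # The 'prediction' here is trivial: the position the move leads to.
        move = max(moves, key=lambda m: -abs((position + m) - target))
        position += move
        path.append(move)
    return path, position


print(choose_action([20, 35, 50, 65, 80], predict_success))  # 50
print(greedy_plan(0, 5, [-1, 1, 2], 3))  # ([2, 2, 1], 5)
```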

Also, people confuse different levels of algorithm.

There are at least 4 levels of algorithm:

• 1 - The architecture.

This input-output calculation for pre-trained models is very well understood. We put together a model consisting of matrix/tensor operations and a few other simple functions, and that is the model. Just a normal but high parameter calculation.

• 2 - The training algorithm.

These are completely understood.

There are certainly lots of questions about what is most efficient, alternatives, etc. But training algorithms harnessing gradients and similar feedback are very clearly defined.

• 3 - The type of problem a model is trained on.

Many basic problem forms are well understood. For instance, for prediction we have an ordered series of information, with later information to be predicted from earlier information. It could simply be an input and response that is learned. Or a long series of information.

• 4 - The solution learned to solve (3) the outer problem, using (2) the training algorithm on (1) the model architecture.

People keep confusing (4) with (1), (2) or (3). But it is very different.
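
To make the distinction concrete, here is a deliberately tiny sketch with all four levels labeled. The one-weight model, the learning rate, and the data are assumptions chosen for illustration, nothing like a real language model. Note that "doubling" appears nowhere in the code for (1), (2), or (3); it exists only in the learned weight, and different data gives a different solution from the same code.

```python
# (1) The architecture: a single multiplication with one parameter.
def model(weight, x):
    return weight * x


# (2) The training algorithm: plain gradient descent on squared error.
def train(pairs, weight=0.0, learning_rate=0.001, epochs=2000):
    for _ in range(epochs):
        for x, y in pairs:
            error = model(weight, x) - y
            weight -= learning_rate * 2 * error * x  # d(error^2)/d(weight)
    return weight


# (3) The type of problem: predict the next item of a series from the last.
def next_item_pairs(series):
    return list(zip(series[:-1], series[1:]))


# (4) The learned solution: whatever weight the data pushes the model toward.
doubling = train(next_item_pairs([1.0, 2.0, 4.0, 8.0, 16.0]))
tripling = train(next_item_pairs([1.0, 3.0, 9.0]))
print(round(doubling, 3), round(tripling, 3))
```

With one weight, (4) can be read off directly. The point of the comment is that with billions of weights and language as the data, it cannot.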

For starters, in the general case, and for most any challenging problem, we never understand their solution. Someday it might be routine, but today we don't even know how to approach that for any significant problem.

Secondly, even with (1), (2), and (3) exactly the same, (4) is going to be wildly different based on the data characterizing the specific problem to solve. For complex problems, like language, layers and layers of sub-solutions to sub-problems have to be learned. And since models are not infinite in size, they also have to learn ways to repurpose sub-solutions and weave them together to address all the ways different sub-problems do and don't share commonalities.

Yes, prediction is the outer form of their solution. But to do that they have to learn all the relationships in the data. And there is no limit to how complex relationships in data can be. So there is no limit on the depths or complexity of the solutions found by successfully trained models.

Any argument they don't reason, based on the fact that they are being trained to predict, confuses at least (3) and (4). That is a category error.

It is true, they reason a lot more like our "fast thinking", intuitive responses, than our careful deep and reflective reasoning. And they are missing important functions, like a sense of what they know or don't. They don't continuously learn while inferencing. Or experience meta-learning, where they improve on their own reasoning abilities with reflection, like we do. And notoriously, by design, they don't "see" the letters that spell words in any normal sense. They see tokens.

Those reasoning limitations can be irritating or humorous. Like when a model seems to clearly recognize a failure you point out, but then replicates the same error over and over. No ability to learn on the spot. But they do reason.

Today, despite many successful models, nobody understands how models are able to reason like they do. There is shallow analysis. The weights are there to experiment with. But nobody can walk away from the model and training process, and build a language model directly themselves. We have no idea how to independently replicate what they have learned, despite having their solution right in front of us. Other than going through the whole process of retraining another one.

nottorp|2 months ago

This is the "but LLMs will get better, trust me" thread?

sublinear|2 months ago

LLMs merely interpolate between the feeble artifacts of thought we call language.

The illusion wears off after about half an hour for even the most casual users. That's better than the old chatbots, but they're still chatbots.

Did anyone ever seriously buy the whole "it's thinking" BS when it was Markov chains? What makes you believe today's LLMs are meaningfully different?

stavros|2 months ago

Did anyone ever seriously buy the whole "it's transporting" BS when it was wheelbarrows? What makes you believe today's trucks are meaningfully different?

mapontosevenths|2 months ago

I suspect that people instinctively believe they have free will, both because it feels like we do, and because society requires us to behave that way even when we don't.

The truth is that the evidence says we don't. See the Libet experiment and its many replications.

Your decisions can be predicted from brain scans up to 10 seconds before you make them, which means they are as deterministic as an LLM's. Sorry, I guess.

Hendrikto|2 months ago

> Your decisions can be predicted from brain scans up to 10 seconds before you make them, which means they are as deterministic as an LLM's.

This conclusion does not follow from the result at all.

namero999|2 months ago

Libet has only measured the latency of metaconsciousness/cognition, nothing else. It says nothing about free will, which is ill defined anyway.

beepbooptheory|2 months ago

What is the import of this to you here? Whether you have free will or you feel like you do, kinda same difference for this particular point right? It doesn't make me more human actually having free will, it is sufficient to simply walk around as if I do.

But beyond that, what do you want to say here? What is lost, what is gained? Are you wanting to say this makes us more like an LLM? How so?

jnd-cz|2 months ago

I looked up the Libet experiment:

"Implications

The experiment raised significant questions about free will and determinism. While it suggested that unconscious brain activity precedes conscious decision-making, Libet argued that this does not negate free will, as individuals can still choose to suppress actions initiated by unconscious processes."