top | item 45877152


Chabsff | 3 months ago

Yeah, but that's their interface. That informs surprisingly little about their inner workings.

ANNs are arbitrary function approximators. The training process uses statistical methods to identify a set of parameters that approximates the function as well as possible. That doesn't necessarily mean that the end result is equivalent to a very fancy multi-stage linear regression. It's a possible outcome of the process, but it's not the only possible outcome.

Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.


yannyu|3 months ago

> Yeah, but that's their interface. That informs surprisingly little about their inner workings.

I'm not sure I follow. LLMs perform probabilistic next-token prediction based on the current context. That is a factual, foundational statement about the technology that runs all LLMs today.

We can ascribe other things to that, such as reasoning or knowledge or agency, but that doesn't change how they work. Their fundamental architecture is well understood, even if we allow for the idea that maybe there are some emergent behaviors that we haven't described completely.
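As a rough illustration of what "probabilistic next-token prediction" means at the interface level, here is a minimal sketch. It assumes the model has already produced one score (logit) per vocabulary entry. The three-word vocabulary, the logit values, and the helper names `softmax` and `sample_next_token` are made up for illustration and do not come from any real model or library.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, vocab, rng):
    """Draw one token according to the softmax probabilities."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical toy vocabulary and scores (assumptions, not real model output).
vocab = ["cat", "dog", "fish"]
logits = [2.0, 1.0, 0.1]

probs = softmax(logits)
token = sample_next_token(logits, vocab, random.Random(0))
print(probs, token)
```

Note that this only describes the last step. It says nothing about how the logits were computed, which is the part the parent comment is pointing at.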

> It's a possible outcome of the process, but it's not the only possible outcome.

Again, you can ascribe these other things to it, but to say that these external descriptions of outputs call into question the architecture that runs these LLMs is a strange thing to say.

> Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.

I don't see how that's a misconception. We evaluate pretty much everything by inputs and outputs, and we use those to infer internal state, because that's all we're capable of in the real world.

kmijyiyxfbklao|3 months ago

Then why not say "they are just computer programs"?

I think the reason people don't say that is that they want to say "I already understand what they are, I'm not impressed, and it's nothing new". But the comment you are replying to is saying that the inner workings are the important, innovative part.

LeroyRaz|3 months ago

What do you mean? What do you think statistical modelling is?

I am very confused by your stance.

The aim of the function approximation is to maximize the likelihood of the observed data (this is standard statistical modelling). Using machine learning (e.g., stochastic gradient descent) on a class of universal function approximators is a standard approach to fitting such a model.

What do you think statistical modelling involves?
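To make the likelihood-maximization framing above concrete, here is a minimal sketch of fitting a model by stochastic gradient descent on the negative log-likelihood. It uses a one-feature logistic regression rather than a neural network to stay short; the toy data, learning rate, epoch count, and the names `nll` and `fit_sgd` are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nll(w, b, data):
    """Average negative log-likelihood under a Bernoulli model."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def fit_sgd(data, lr=0.1, epochs=200, seed=0):
    """Minimize the NLL one example at a time (plain SGD)."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(w * x + b)
            # Gradient of the per-example NLL with respect to w and b.
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# Hypothetical toy data: (feature, label) pairs, deliberately not separable.
toy_data = [(-2.0, 0), (-1.0, 0), (-0.5, 1), (0.5, 0), (1.0, 1), (2.0, 1)]

initial_loss = nll(0.0, 0.0, toy_data)
w, b = fit_sgd(toy_data)
final_loss = nll(w, b, toy_data)
print(initial_loss, final_loss, w, b)
```

Swapping the linear score `w * x + b` for a deeper network changes the function class but not the objective, which is the sense in which the training procedure is standard statistical model fitting.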