trashtester | 8 months ago

The "next token prediction" is a distraction. That's not where the interesting part of an AI model happens.

If you think of the tokenization near the end as a serializer, something like turning an object model into JSON, you get a better understanding. The interesting part of an OOP program is not in the JSON, but in what happens in memory before the JSON is created.
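The analogy can be made concrete with a toy sketch (hypothetical code, purely illustrative: the computation of interest happens on in-memory state, and serialization is just a flat view of it):

```python
import json

class Model:
    """All the interesting work happens on the in-memory state."""
    def __init__(self):
        self.state = {}

    def think(self, x):
        # "Latent" computation: the internal state is transformed in
        # ways the serialized output never directly shows.
        self.state["doubled"] = x * 2
        self.state["squared"] = x ** 2

    def serialize(self):
        # The "tokenizer"/serializer step: flatten rich state to text.
        return json.dumps(self.state, sort_keys=True)

m = Model()
m.think(3)
print(m.serialize())  # {"doubled": 6, "squared": 9}
```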

Likewise, the interesting parts of a neural net model, whether it's LLMs, AlphaProteo or some diffusion-based video model, happen in the steps that operate in their latent space, which is in many ways similar to our subconscious thinking.

In those layers, the AI models detect deeper and deeper patterns of reality. Much deeper than the surface patterns of the text, images, video, etc. used to train them. Also, many of these patterns generalize when different modalities are combined.

From this latent space, you can "serialize" outputs in several different ways. Text is one, image/video another. For now, the latent spaces are not general enough to do all equally well; instead, models are created that specialize in one modality.

I think the step to AGI does not require throwing a lot more compute at the models, but rather having them straddle multiple modalities better, in particular these:

- Physical world modelling at the level of Veo3 (possibly with some lessons from self-driving or robotics models for elements like object permanence and perception)

- Symbolic processing of the best LLMs

- Ability to be goal-oriented and iterate towards a goal, similar to the Alpha* family of systems

- Optionally: optimized for the use of a few specific tools, including a humanoid robot

Once all of these are integrated into the same latent space, I think we basically have what it takes to replace most human thought.

sgt101 | 8 months ago

>which is in many ways similar to our subconscious thinking

This is just made up.

- We don't have any useful insight into human subconscious thinking.

- We don't have any useful insight into the structures that support human subconscious thinking.

- The mechanisms that support human cognition that we do know about are radically different from the mechanisms that current models use. For example, we know that biological neurons and synapses are structurally diverse, we know that suppression and control signals are used to change the behaviour of the networks, and we know that chemical control layers (hormones) transform the state of the system.

We also know that biological neural systems continuously learn and adapt, for example in the face of injury. Large models just don't do these things.

Also, this thing about deeper and deeper realities? C'mon, it's surface-level association all the way down!

ixtli | 8 months ago

Yeah, whenever we get into this sort of "what's happening in the network is like what's going on in your brain" discussion, people never have concrete evidence for what they're talking about.

scarmig | 8 months ago

The diversity is itself indicative, though, that intelligence isn't bound to the particularities of the human nervous system. Across different animal species, nervous systems show a radical diversity. Different architectures; different or reversed neurotransmitters; entirely different neural cell biologies. It's quite possible that "neurons" evolved twice, independently. There's nothing magic about the human brain.

Most of your critique is surface level: you can add all kinds of different structural diversity to an ML model and still find learning. Transformers themselves are formally equivalent to "fast weights" (suppression and control signals). Continuous learning is an entire field of study in ML. Or, for injury, you can randomly mask out half the weights of a model, still get reasonable performance, and retrain the unmasked weights to recover much of your loss.
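The injury point can be illustrated with a minimal sketch (an assumed toy network, not a real pruning study): randomly zeroing half the weights of a small two-layer net still yields a usable forward pass, which is the starting point for the recovery-by-retraining described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer network with random weights (illustration only).
W1 = rng.standard_normal((8, 16))
W2 = rng.standard_normal((16, 4))

def forward(x, W1, W2):
    h = np.maximum(0, x @ W1)  # ReLU hidden layer
    return h @ W2

x = rng.standard_normal((1, 8))
full = forward(x, W1, W2)

# "Injure" the network: zero out a random half of each weight matrix.
mask1 = rng.random(W1.shape) < 0.5
mask2 = rng.random(W2.shape) < 0.5
masked = forward(x, W1 * mask1, W2 * mask2)

# The masked network still computes a finite output of the same shape;
# in a trained model the surviving weights could then be fine-tuned.
print(full.shape, masked.shape)  # (1, 4) (1, 4)
```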

Obviously there are still gaps in ML architectures compared to biological brains, but there's no particular reason to believe they're fundamental to existence in silico, as opposed to myelinated bags of neurotransmitters.

ben_w | 8 months ago

The bullet list is a good point, but:

> We also know that biological neural systems continuously learn and adapt, for example in the face of injury. Large models just don't do these things.

This is a deliberate choice on the part of the model makers, because a fixed checkpoint is useful for a product. They could just keep the training mechanism going, but that's like writing code without version control.

> Also this thing about deeper and deeper realities? C'mon, it's surface level association all the way down!

To the extent I agree with this, I think it conflicts with your own point about us not knowing how human minds work. Do I, myself, have deeper truths? Or am I myself making surface-level association after surface-level association, but with enough levels to make it seem deep? I do not know how many grains make the heap.

phorkyas82 | 8 months ago

As far as I understand, any AI model is just a linear combination of its training data. Even if that is as large a corpus as the entire web... it's still just a sophisticated compression of other people's expressions.

It has not had its own experiences, not interacted with the outer world. Dunno, I won't rule out that something operating solely on language artifacts could develop intelligence or consciousness, whatever that is... but so far there are also enough humans we could care about and invest in.

tlb | 8 months ago

LLMs are not a linear combination of training data.

Some LLMs have interacted with the outside world, such as through reinforcement learning while trying to complete tasks in simulated physics environments.
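The non-linearity claim is easy to demonstrate: a single ReLU already breaks the additivity that any linear map would have to satisfy (toy sketch, not an actual LLM):

```python
import numpy as np

# A linear map f satisfies f(a + b) == f(a) + f(b) for all inputs.
# One ReLU, the basic nonlinearity in neural nets, violates this:
def f(x):
    return np.maximum(0, x)

a, b = np.array([1.0]), np.array([-2.0])
print(f(a + b))     # [0.]
print(f(a) + f(b))  # [1.]
```

Since every modern LLM stacks many such nonlinearities, its output cannot be a linear combination of anything, training data included.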

olmo23 | 8 months ago

Just because humans can describe it, doesn't mean they can understand (predict) it.

And the web contains a lot more than people's expressions: think of all the scientific papers with tables and tables of interesting measurements.

andsoitis | 8 months ago

> the AI models detect deeper and deeper patterns of reality. Much deeper than the surface pattern of the text

What are you talking about?