top | item 44274232

ncarlson | 8 months ago

> AI we don’t have a model.

So, some engineers just stumbled upon LLMs and said, "Holy smokes, we've created something impressive, but we really can't explain how this stuff works!"

We built these things. Piece by piece. If you don't understand the state-of-the-art architectures, I don't blame you. Neither do I. It's exhausting trying to keep up. But these technologies, by and large, are understood by the engineers that created them.

ijidak | 8 months ago

Not true. How the higher-level thought occurs remains a mystery.

This is an emergent behavior that wasn't predicted prior to the first breakthroughs, which were aimed at translation, not at this type of higher-level reasoning.

Put it this way: if we truly understood how LLMs think, we could predict the maximum number of parameters that would achieve peak intelligence and go straight to that number.

Just as we now know precisely the mass-density threshold that yields a black hole, and so on.

The fact that we don’t know when scaling will cease to yield new levels of reasoning means we don’t have a precise understanding of how the parameters are yielding higher levels of intelligence.

We’re just building larger and seeing what happens.
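For what it's worth, that is roughly what the empirical scaling-law literature does: fit a power law to losses measured *after* training, rather than derive a ceiling from first principles. A minimal sketch of that kind of fit (all numbers invented for illustration):

```python
import numpy as np

# Hypothetical (invented) data points: parameter count N vs. held-out loss.
N = np.array([1e7, 1e8, 1e9, 1e10])
L = np.array([4.2, 3.4, 2.8, 2.3])

# Empirical power law L ≈ c * N**(-alpha); taking logs makes the fit linear.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
alpha = -slope

# Extrapolating gives a trend line for loss, but says nothing about where
# (or whether) qualitatively new capabilities appear -- the exponent is
# measured from experiments, not derived from a theory of the model.
predicted_loss = np.exp(intercept) * 1e11 ** slope
print(f"fitted alpha = {alpha:.3f}, extrapolated loss at 1e11 params = {predicted_loss:.2f}")
```

This is descriptive curve-fitting: you have to train the models to get the data points, which is the commenter's "build larger and see what happens" in miniature.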

twelve40 | 8 months ago

> if we truly understood how LLMs think perfectly we could predict the maximum number of parameters that would achieve peak

It's a bit of a strange argument to make. We've been making airplanes for 100+ years; we understand how they work, and there is absolutely no magic or emergent behavior in them. Yet even today nobody can produce a perfectly shaped airframe on the first try: it's still a long, complicated process of calculations, wind-tunnel tests, and trial and error. That doesn't mean we don't understand how airplanes work.

ncarlson | 8 months ago

> How the higher level thought is occurring continues to be a mystery. This is an emergent behavior that wasn’t predicted prior to the first breakthroughs which were intended for translation, not for this type of higher level reasoning.

I'm curious what you mean by higher level thought (or reasoning). Can you elaborate or provide some references?

ninetyninenine | 8 months ago

The engineers who built these things don't actually understand how they work. Literally. In fact, you can ask them and they will say so readily. I believe the CEO of Anthropic is on record saying this.

If they did understand LLMs, why do they have so much trouble explaining why an LLM produced a particular output? Why can't they fully control an LLM?

These are algorithms running on computers, which are deterministic machines that in theory we have total and absolute control over. The fact that we can't control something running on this type of machine points to the sheer complexity of, and our lack of understanding of, the thing we are trying to run.
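Determinism and controllability do come apart, though, even in tiny systems. A toy illustration (the logistic map, nothing to do with LLMs themselves): every step is a rule we wrote down and fully deterministic, yet in the chaotic regime the only practical way to know where a trajectory ends up is to run it:

```python
def logistic(x: float, steps: int, r: float = 3.99) -> float:
    """Iterate the deterministic recurrence x -> r * x * (1 - x)."""
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

# Perfectly repeatable: same input, same output, every time.
assert logistic(0.4, 100) == logistic(0.4, 100)

# But nudge the input in the 9th decimal place and the trajectories
# decorrelate after 100 steps -- deterministic, yet the output can't be
# anticipated without actually running the computation.
a = logistic(0.400000000, 100)
b = logistic(0.400000001, 100)
print(abs(a - b))
```

A far simpler system than an LLM, of course; the point is only that "deterministic" doesn't by itself imply "explainable" or "controllable."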

ninetyninenine | 8 months ago

Put it this way, Carlson: if you were building LLMs, if you understood machine learning, if you were one of those engineers who work at OpenAI, you would agree with me.

The fact that you don't agree indicates you literally don't get it. It also indicates you aren't in any way an engineer who works on AI, because what I am describing is an unequivocal, universally held viewpoint among literally the people who build these things.

stevenhuang | 8 months ago

> But these technologies, by and large, are understood by the engineers that created them.

Simply incorrect. Look into the field of AI interpretability: the learned weights are black boxes, and we don't know what goes on inside them.

Workaccount2 | 8 months ago

Models are grown, not built. The ruleset is engineered and the training framework built, but the model that grows through training is incredibly dense and complex.
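That distinction shows up even in a toy setting. In the sketch below (plain NumPy, everything invented for illustration), the "ruleset" — architecture, loss, and update rule — is fully spelled out by hand, while the weights that come out are just whatever numbers training settles on, with no individually readable meaning:

```python
import numpy as np

rng = np.random.default_rng(0)

# Engineered part: data, a 1-hidden-layer tanh net, squared loss, plain GD.
X = rng.uniform(-1, 1, size=(256, 1))
y = np.sin(3 * X)                      # toy target the model must "grow" to fit

W1 = rng.normal(0, 0.5, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)

def loss() -> float:
    h = np.tanh(X @ W1 + b1)
    return float(((h @ W2 + b2 - y) ** 2).mean())

initial = loss()
lr = 0.1
for _ in range(5000):
    h = np.tanh(X @ W1 + b1)                      # forward pass: a rule we wrote
    err = (h @ W2 + b2) - y
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)   # backprop: also a rule we wrote
    gh = err @ W2.T * (1 - h ** 2)
    gW1 = X.T @ gh / len(X); gb1 = gh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# Grown part: the trained weights now approximate sin(3x), but no single
# entry of W1 or W2 means anything on its own -- the behavior lives in the
# whole tangle of numbers, which is the opacity the thread is arguing about.
final = loss()
print(f"loss: {initial:.3f} -> {final:.3f}")
```

Scale that tangle up by ten or eleven orders of magnitude and you get the interpretability problem the other commenters describe.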