ACCount37|3 months ago
A lot of people say that, but no one, not a single person, has ever pointed out a fundamental limitation that would prevent an LLM from going all the way.
If LLMs have limits, we have yet to find them.

lossolo|3 months ago
We have already found limitations of the current LLM paradigm, even if we don't have a theorem saying transformers can never be AGI.
Scaling laws show that performance keeps improving with more parameters, data, and compute, but only along a smooth power law with sharply diminishing returns: each extra order of magnitude of compute buys a smaller gain than the last, and recent work suggests economic and physical constraints will make it hard to continue that trend indefinitely.
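For intuition, here's a toy power-law curve in Python. The form is Chinchilla-style (loss = E + A/N^alpha), but the constants are made up for illustration, not fitted to any real model:

    # Hypothetical scaling law: loss = E + A / N^alpha
    # (E, A, alpha are illustrative constants, not fitted values)
    E, A, alpha = 1.7, 400.0, 0.34

    def loss(n_params: float) -> float:
        return E + A / n_params ** alpha

    # Each 10x in parameters buys a smaller loss improvement than the last.
    prev = None
    for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
        cur = loss(n)
        gain = "" if prev is None else f"  (gain {prev - cur:.3f})"
        print(f"{n:.0e} params -> loss {cur:.3f}{gain}")
        prev = cur

Running it, each 10x step in parameters improves loss by roughly half as much as the previous step, which is the "diminishing returns" shape in question.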
OOD generalization is still an unsolved problem: models struggle under domain shift, on long-tail cases, and when you combine familiar concepts in systematically new ways (especially on reasoning-heavy tasks). This is by now a well-documented limitation of LLMs and multimodal LLMs.
Work on chain-of-thought faithfulness shows that the step-by-step reasoning a model prints doesn't reliably match its actual internal computation; models frequently generate plausible but misleading explanations of their own answers (see Anthropic's work on this, e.g. Lanham et al. 2023, "Measuring Faithfulness in Chain-of-Thought Reasoning"). That means they lack self-knowledge about how and why they got a result, and I doubt you can get AGI without that.
None of this proves that no LLM-based architecture could ever reach AGI. But it directly contradicts the idea that we haven't found any limits: we've already found multiple major limitations of current LLMs, and there's no evidence that blindly scaling this recipe is enough to cross from very capable assistant to AGI.

dehsge|3 months ago
LLMs are bounded by the same limits computers are. They run on computers, so a prime example of a limitation is Rice's theorem: any "AI" that writes code is unable (just like humans) to decide, in general, whether its output is error-free.
This means a multi-agent workflow that writes code without a human in the loop may or may not be error-free, and nothing inside the workflow can decide which in general.
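A compact sketch of the underlying argument, the classic halting-problem diagonalization (the function names are hypothetical; Rice's theorem generalizes this from "halts" to any non-trivial semantic property, including "is error-free"):

    # 'halts' is hypothetical: the argument below shows no correct
    # implementation of it can exist.
    def halts(program_source: str, input_data: str) -> bool:
        raise NotImplementedError("no program can implement this correctly")

    def paradox(src: str) -> None:
        # Do the opposite of whatever halts() predicts about running
        # this program on its own source code.
        if halts(src, src):
            while True:  # halts() said we halt, so loop forever
                pass
        # halts() said we loop forever, so halt immediately

    # If halts() were correct, paradox applied to its own source would
    # contradict the prediction either way, so no such checker exists.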
LLMs are also bounded by runtime complexity. Could an LLM find the shortest Hamiltonian path through a set of cities in polynomial time? That problem is NP-hard, and an LLM is still a program running on the same hardware.
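A brute-force sketch of the wall (toy distance matrix, illustrative only): the search space is n! orderings, and no amount of model scale changes that growth rate:

    import math
    from itertools import permutations

    # Brute-force shortest Hamiltonian path over a toy 4-city distance matrix.
    dist = [
        [0, 2, 9, 4],
        [2, 0, 6, 3],
        [9, 6, 0, 8],
        [4, 3, 8, 0],
    ]
    n = len(dist)

    def path_length(path):
        # Sum of edge lengths along consecutive cities in the path.
        return sum(dist[a][b] for a, b in zip(path, path[1:]))

    best = min(permutations(range(n)), key=path_length)
    print("best path:", best, "length:", path_length(best))

    # The search space explodes factorially as cities are added.
    for cities in (10, 20, 50):
        print(f"{cities} cities -> {math.factorial(cities):.2e} orderings")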
LLMs are bounded by in-model context: could an LLM create and use a genuinely new language with no trace of it in its model or context?