FFP999|2 years ago

The moment I read "how to reason" in the headline, my bullshit detector started to go off.

LLMs do not reason, they do not think, they are not AGI. They generate by regurgitating.

coderaptor|2 years ago

I haven’t heard a definition of “reasoning” or “thinking” that proves humans aren’t doing exactly the same probabilistic regurgitation.

I don’t think it’s possible to prove; feels like a philosophical question.

RationalDino|2 years ago

I won't define reasoning, just call out one aspect.

We have the ability to follow a chain of reasoning, say "that didn't work out", backtrack, and consider another. ChatGPT seems to get tangled up when its first (very good) attempt goes south.

This is definitely a barrier that computers can cross. AlphaZero is better at it than we are. But it is something we do that clearly isn't just the probabilistic regurgitation ChatGPT relies on.

That said, the human brain combines a bunch of different areas that seem to work in different ways. Our ability to engage in this kind of reasoning, for example, is known to happen mostly in the left frontal cortex. So it seems likely that AGI will also need to combine different modules that work in different ways.

On that note, when you add tools to ChatGPT, it suddenly can do a lot more than it could before. If those tools include the right feedback loops, the ability to store/restore context, and so on, what could it then do? This isn't just a question of putting the right capabilities in a box; they have to work together toward a goal. But I'm sure we haven't reached the limit of what can be achieved.
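
A minimal sketch of that kind of loop, with everything hypothetical (call_llm stands in for any chat-completion client, check for whatever external tool verifies an attempt):

    # Hypothetical sketch: an LLM call wrapped in a feedback loop that
    # stores failed attempts as context. Nothing here is a real API.

    def call_llm(prompt: str) -> str:
        raise NotImplementedError  # swap in any chat-completion client

    def solve_with_feedback(task: str, check, max_attempts: int = 5):
        failures = []  # stored context: what was tried and why it failed
        for _ in range(max_attempts):
            prompt = task
            if failures:
                prompt += "\n\nEarlier attempts that failed:\n" + "\n".join(failures)
            attempt = call_llm(prompt)
            ok, reason = check(attempt)  # test runner, verifier, etc.
            if ok:
                return attempt
            # "that didn't work out": record it and try another path
            failures.append(f"- {attempt[:200]} (failed: {reason})")
        return None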

xanderlewis|2 years ago

It feels like humans do do a similar regurgitation as part of a reasoning process. But if you play around with LLMs and ask them mathematical questions beyond the absolute basics, it doesn’t take long before they trip up and reveal a total lack of ‘understanding’ as we would usually understand it.

I think we’re easily fooled by the fact that these models have mastered the art of talking like an expert. Within any domain you choose, they’ve mastered the form. But it only takes a small amount of real expertise (or even basic knowledge) to immediately spot that it’s all gobbledygook, and I strongly suspect that when it isn’t, that’s just down to luck (and the fact that almost any question you can ask has been asked before and is in the training data).

Given the amount of data being swallowed, it’s hard to believe that the probabilistic regurgitation you describe is ever going to lead to anything like ‘reasoning’ purely through scaling. You’re right that asking what reasoning is may be a philosophical question, but you don’t need to go very far to empirically verify that these models absolutely do not have it.

cloverich|2 years ago

On the other hand, it seems rather intuitive that we have a logic-based component? It's the underpinning of science. We have to be taught to recognize when we've stumbled upon something that needs to be tested, but we can be taught that, and once we learn to recognize it, we do so intuitively in action. ChatGPT can do this in a rudimentary way as well: it says a program should work a certain way, then it writes it, then it runs it, and when the answer doesn't come out as expected (at this point, probably just error cases), it goes back and changes it.

It seems similar to what we do, if on a more basic level. At any rate, it seems like a fairly straightforward 1-2 punch that, even if not truly intelligent, would let it break through its current barriers.
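
A rough sketch of that loop for the program-writing case; generate_program is a hypothetical LLM call, while the run-and-compare half uses only the standard library:

    import subprocess
    import tempfile

    def generate_program(spec: str, feedback: str = "") -> str:
        raise NotImplementedError  # hypothetical: ask the model for Python source

    def write_run_check(spec: str, expected: str, rounds: int = 3) -> bool:
        feedback = ""
        for _ in range(rounds):
            source = generate_program(spec, feedback)
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(source)
            result = subprocess.run(["python", f.name], capture_output=True,
                                    text=True, timeout=10)
            if result.stdout.strip() == expected:
                return True  # the answer came out as expected
            # it didn't: go back and change the program
            feedback = f"Output was {result.stdout!r}, expected {expected!r}."
        return False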

CAP_NET_ADMIN|2 years ago

LLMs can be trained on all the math books in the world, from the easiest to the most advanced, and they can regurgitate them almost perfectly, yet they won't apply the concepts in those books to their actions. I'd count the ability to learn new concepts and methods, and then actually use them, as "reasoning".

intended|2 years ago

It’s possible to prove.

Use an LLM to do a real-world task that you should be able to achieve by reasoning.

kgeist|2 years ago

Just yesterday I saw an example of a person asking GPT what "fluftable" means. The word was invented by their little daughter and they didn't know what it meant. GPT reasoned it was a portmanteau of "fluffy" and "comfortable", and that made sense because it was used in reference to a pillow. If it's just regurgitation, I'd like to know how it's able to understand novel words not found in the training data...

svaha1728|2 years ago

I would read Francois Chollet's explanation of this. It's very good: https://fchollet.substack.com/p/how-i-think-about-llm-prompt...

For words that are not in the model's vocabulary, like 'fluftable', the model uses a subword tokenization strategy. It breaks down the word into smaller known subunits (subwords or characters) and represents each subunit with its own vector. By understanding the context in which 'fluftable' appears and comparing it to known words with similar subunits, the model can infer a plausible meaning for the word. This is done by analyzing the vector space in which these representations exist, observing how the vectors align or differ from those of known words.
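
You can watch that splitting happen. A quick check with OpenAI's tiktoken library (the exact pieces depend on which encoding you load):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("fluftable")
    print([enc.decode([i]) for i in ids])
    # Prints a few subword pieces (the exact split varies by encoding):
    # fragments the model has seen in thousands of other words, which is
    # all it has to go on for a made-up word.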

'As always, the most important principle for understanding LLMs is that you should resist the temptation of anthropomorphizing them.'

nighthawk454|2 years ago

Because you’re not understanding what it’s regurgitating. It’s not a fact machine that regurgitates knowledge; in fact, it’s not really so good at that. It regurgitates plausible patterns of language, and combining words like this is hardly a rare pattern.

intended|2 years ago

Which is also within the realm of House MD vs. a real doctor, potentially even more so.

LLMs are trained on reams of text; good performance here is not unexpected.

To put it another way: would you hire ChatGPT?

For work, you need more than text skills.

QuadmasterXLII|2 years ago

With only the information we had in 2020, the two theories “language models don’t reason, they regurgitate” and “as language models scale, they begin to think and reason” both made predictions, and the people who invested time and money on the latter theory’s predictions have done well for themselves.

intended|2 years ago

The people who bet on generative tasks are getting mileage out of it.

People who bet on reasoning tasks, not so much.

FFP999|2 years ago

If you're trying to tell me there's a sucker born every minute, I knew that.

schleck8|2 years ago

AGI doesn't reason either. No one defines AGI as "AI, but with reasoning". It's usually "AI that outperforms humans at all disciplines, by any degree". Maybe you confused it with ASI, but even then reasoning isn't a requirement, afaik.

pelorat|2 years ago

Reasoning is a learnt concept that involves retrieving memories and running them through an algorithm, also retrieved from memory; you then loop the process until a classifier deems the result adequate to the given goal.
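
Rendered literally, that description is just a retrieve-apply-check loop. Every function below is a hypothetical stand-in, since the comment describes cognition rather than a concrete system:

    def reason(goal, memory, retrieve_fact, retrieve_procedure, adequate,
               max_steps=10):
        state = retrieve_fact(memory, goal)          # retrieve memories
        for _ in range(max_steps):
            step = retrieve_procedure(memory, goal)  # the algorithm comes from memory too
            state = step(state)                      # run it
            if adequate(state, goal):                # classifier: adequate to the goal?
                return state
        return None                                  # gave up without an adequate result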

sharemywin|2 years ago

I asked GPT-4 and it had some counterpoints:

Reasoning blends learned skills and natural cognition. It integrates new information, not just past memories. Reasoning is adaptable, not rigidly algorithmic. Emotions and context also shape reasoning.

which seemed to make sense.

GenericPoster|2 years ago

Did you only read the title? The abstract gives you a pretty good idea of what they mean when they say "reason". It's pretty easy to understand. No need to immediately call bullshit just because of a minor semantic disagreement.

>ThEY DON'T tHiNk. They'rE JuSt STochAStiC pARrotS. It'S not ReAL AGi.

It doesn't even matter whether these claims are true or not; they miss the point of the conversation and the paper. "Reason" is a perfectly valid word to use. So is "think". If you ask it a question and then follow up with 'think carefully' or 'explain carefully', you'll get the same response.

inb4 AcTUALLy LlMS Can'T do aNYtHIng CaRefUlly BECaUse pRogRAms ARen'T caRefUl

gnaritas99|2 years ago

You are simply incorrect. They can reason.

riku_iki|2 years ago

And how can you tell they reason rather than parrot some text from the training data?

There are papers that test LLMs on freshly generated reasoning problems, and they usually fail.
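
The probe those papers use is easy to sketch: generate a problem whose exact text can't already be in a training corpus, so a correct answer has to come from applying the rule rather than recalling it. Sending the question to a model is left out; the names here are illustrative only:

    import random

    def make_novel_problem(seed: int):
        # A made-up operator plus random operands: the exact question
        # almost certainly appears in no training corpus.
        rng = random.Random(seed)
        a, b = rng.randint(100, 999), rng.randint(100, 999)
        question = (f"Define x # y = x*y - (x + y). What is {a} # {b}? "
                    "Reply with the number only.")
        answer = a * b - (a + b)
        return question, answer

    question, answer = make_novel_problem(seed=42)
    # Send `question` to the model and compare its reply to `answer`.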