My personal 5 cents is that reasoning will be there when an LLM gives you some kind of outcome and then, when questioned about it, can explain every bit of the result it produced.
For example, if we asked an LLM to produce an image of a "human woman, photorealistic", it produces a result. After that you should be able to ask it "tell me about its background" and it should be able to explain: "Since the user didn't specify a background in the query, I randomly decided to draw her standing in front of a fantasy backdrop of Amsterdam's iconic houses. Amsterdam houses are usually 3 stories tall, attached to each other and 10 meters wide. They usually have cranes on the top floor, which help bring goods up to the top floor, since the doors are too narrow for any object wider than 1 m. The woman stands approximately 25 meters in front of the houses. She is 1.59 m tall, which gives us the correct perspective. It is 11:16 am on August 22nd, which I used to calculate the correct position of the sun and align all shadows according to the projected lighting conditions. The color of her skin is set at RGB:xxxxxx, chosen randomly," etc.
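Just to make the kind of geometry such an explanation appeals to concrete, here is a minimal sketch: apparent size from a pinhole camera model and shadow length from solar elevation. The camera distance, focal length, house height and sun elevation below are assumptions invented for illustration, not values any model actually reports:

    import math

    def apparent_height_px(real_height_m, distance_m, focal_px=1000.0):
        """Projected height in pixels under a simple pinhole-camera model."""
        return focal_px * real_height_m / distance_m

    def shadow_length_m(object_height_m, sun_elevation_deg):
        """Shadow length on flat ground for a given solar elevation."""
        return object_height_m / math.tan(math.radians(sun_elevation_deg))

    camera_to_woman_m = 8.0     # assumed camera distance
    woman_height_m = 1.59       # from the example above
    woman_to_houses_m = 25.0    # from the example above
    house_height_m = 10.0       # assumed height of a 3-story facade

    woman_px = apparent_height_px(woman_height_m, camera_to_woman_m)
    house_px = apparent_height_px(house_height_m, camera_to_woman_m + woman_to_houses_m)

    sun_elevation_deg = 45.0    # assumed; a renderer would derive it from date, time and latitude
    print(f"woman: {woman_px:.0f} px, houses: {house_px:.0f} px")
    print(f"her shadow: {shadow_length_m(woman_height_m, sun_elevation_deg):.2f} m")

The point is only that every such statement ("correct perspective", "shadows aligned with the sun") corresponds to checkable arithmetic like this.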
And it is not too much to ask of LLMs. They have access to all the information above, as they have read the whole internet. So there are certainly descriptions of Amsterdam architecture, of what a human body looks like, and of how to correctly estimate the time of day from shadows (and vice versa). The only thing missing is the logic that connects all this information and applies it correctly to generate the final image.
I like to think of LLMs as fancy, genius compression engines. They took all the information on the internet, compressed it, and are able to cleverly query that information for the end user. That is a tremendously valuable thing, but whether intelligence emerges out of it, I'm not sure. Digital information doesn't necessarily contain everything needed to understand how it was generated and why.
I see two approaches for explaining the outcome:
1. Reasoning back over the result and justifying it.
2. Explainability: justifying the result by looking at which neurons were activated (sketched below).
The first could lead to lying; think of a high schooler explaining copied homework.
The second does genuinely access the paths that influenced the decision, but it is a hard task because of the way neural networks inherently work.
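For what it's worth, the raw ingredient of the second approach, recording which units activate for a given input, is easy to sketch; turning those activations into a faithful explanation is the hard part. A minimal illustration with PyTorch forward hooks, using a toy two-layer model as a stand-in for a real network:

    import torch
    import torch.nn as nn

    # Toy model standing in for a real network.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

    activations = {}

    def save_activation(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    # Record the hidden-layer output every time the model runs.
    model[1].register_forward_hook(save_activation("hidden_relu"))

    x = torch.randn(1, 4)
    model(x)

    fired = (activations["hidden_relu"] > 0).squeeze(0)
    print("hidden units that fired:", fired.nonzero(as_tuple=True)[0].tolist())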
I think it's hard to enumerate the unknown, but I'd personally love to see how models like this perform on things like word problems where you introduce red herrings. Right now, LLMs at large tend to struggle mightily to understand when some of the given information is not only irrelevant, but may explicitly serve to distract from the real problem.
That’s not an inability to reason, though; that’s having a social context.
Humans also don’t tend to operate in a rigorously logical mode; they understand that math word problems are an exception where the language may be adversarial because they’re trained for that special context in school. If you tell the LLM that social context, e.g. that the language may be deceptive, its “mistakes” disappear.
What you’re actually measuring is that the LLM defaults to assuming you misspoke while trying to include relevant information, rather than that you were trying to trick it, which is the social context you’d expect from training on general chat interactions. Establishing context in psychology is hard.
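To make both points concrete: a word problem with a deliberate red herring, asked once with a default system prompt and once with the social context made explicit. The client, model name and exact prompts below are assumptions for illustration (the OpenAI Python client is used as an example), and how much the warning helps will vary by model:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # The "5 smaller apples" changes nothing, but models often subtract it anyway.
    problem = (
        "Anna picks 44 apples on Friday and 58 apples on Saturday. "
        "5 of Saturday's apples were a bit smaller than the rest. "
        "How many apples did Anna pick in total?"
    )  # correct answer: 102

    def ask(system_msg, user_msg, model="gpt-4o-mini"):  # model name is an assumption
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "system", "content": system_msg},
                      {"role": "user", "content": user_msg}],
        )
        return resp.choices[0].message.content

    print(ask("You are a helpful assistant.", problem))
    print(ask("This is a test. Some details may be irrelevant or deliberately "
              "misleading; only use information that affects the answer.", problem))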
LLMs are still bound to a prompting session. They can't form long-term memories, can't ponder on them and can't develop experience. They have no cognitive architecture.
'Agents' (i.e. workflows intermingling code and calls to LLMs) are still a thing (as shown by the fact that there is a post by Anthropic on this subject on the front page right now) and they are very hard to build.
One consequence of that: it's not possible to have an LLM exhaustively explore a topic.
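To make "workflows intermingling code and calls to LLMs" concrete, here is a minimal sketch of such a loop; call_llm is a hypothetical stand-in for whatever chat API is used, and the stopping rule is exactly the part that makes exhaustive exploration hard:

    # Ordinary code that loops, calls an LLM, and decides what to do next.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in a real model API here")

    def explore_topic(topic: str, max_steps: int = 5) -> list[str]:
        notes: list[str] = []
        question = f"What is the most important open question about {topic}?"
        for _ in range(max_steps):
            answer = call_llm(question)
            notes.append(answer)
            # The surrounding code, not the model, decides how to continue and when to stop.
            if "nothing further" in answer.lower():
                break
            question = f"Given this note, what should be examined next?\n\n{answer}"
        return notes

The loop has no principled way to know the topic is exhausted; the step bound is doing the real work.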
LLMs don’t, but who said AGI should come from LLMs alone? When I ask ChatGPT about something “we” worked on months ago, it “remembers” and can continue the conversation with that history in mind.
I’d say humans are also bound to prompting sessions in that way.
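The general pattern behind that kind of "remembering" can be sketched naively: persist salient facts between sessions and re-inject them into the next prompt. This is only an assumption about the pattern, not a description of how ChatGPT's memory feature actually works:

    import json
    from pathlib import Path

    MEMORY_FILE = Path("memories.json")

    def load_memories() -> list[str]:
        return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []

    def save_memory(fact: str) -> None:
        memories = load_memories()
        memories.append(fact)
        MEMORY_FILE.write_text(json.dumps(memories))

    def build_prompt(user_message: str) -> str:
        context = "\n".join(f"- {m}" for m in load_memories())
        return f"Things you remember about this user:\n{context}\n\nUser: {user_message}"

    save_memory("We were designing a birdhouse back in March.")
    print(build_prompt("Can we pick up where we left off?"))

Long-term memory here is reduced to more prompting, which is also why it looks like humans being bound to prompting sessions too.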
kinda interesting: every single CS person (especially phds), when talking about reasoning, is unable to concisely quantify, enumerate, qualify, or define reasoning.
people with (high) intelligence talking about and building (artificial) intelligence, but never able to convincingly explain aspects of intelligence. they just often talk ambiguously and circularly around it.
what are we humans getting ourselves into, inventing skynet :wink.
it's been an ongoing pet project of mine to tackle reasoning, but i can't answer your question with regards to llms.
>> Kinda interesting: every single CS person (especially phds), when talking about reasoning, is unable to concisely quantify, enumerate, qualify, or define reasoning.
Kinda interesting that mathematicians also can't do the same for mathematics. And yet.
concordDance|1 year ago
Large language models don't do that. You'd want an image model.
Or did you mean "multi-model AI system" rather than "LLM"?
mistermann|1 year ago
Luckily we don't know the problem exists, so in a cultural/phenomenological sense it is already cracked.
amelius|1 year ago
Does it include the invention of tools?