neckardt | 1 year ago

I’ll bite.

Theory of mind (ToM) is the ability to understand that other people have thoughts, feelings, and intentions different from your own.

So let’s run through a basic scenario with ChatGPT 4o:

Prompt 1: Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?

Response 1: The driver may be distracted, drowsy, intoxicated, experiencing a medical emergency, avoiding an obstacle, or not checking their blind spot before changing lanes.

Prompt 2: What might that person be feeling or intending?

Response 2: The driver might be feeling distracted, rushed, unaware, or panicked. They may intend to change lanes without realizing your presence, be reacting to something on the road, or simply have made an unintentional mistake.

Prompt 3: You notice there was no obstacle. Any other reason the driver might do this?

Response 3: Yes, the driver might be experiencing fatigue, intoxication, a medical issue (such as a seizure or dizziness), distraction (like using a phone or adjusting the radio), aggressive driving, or simply failing to check their blind spot properly.

Seems to me that ChatGPT has a good idea of what another driver might be feeling while on the road. It gets this right for most other cases too.

Could you provide a counterexample for when it fails to understand what other people may be feeling in certain scenarios?

Arkhaine_kupo|1 year ago

Well, I think a fairly easy counterexample is that if you keep going and the information falls outside its context window, the questions you had about the driver stop making sense.

So let's say you ask about a driver who swerved, and ChatGPT's response correctly talks about the driver's possible state.

Then you talk more about cars, road conditions, etc., and once the context is far enough away, you ask: what was the driver doing again?

And you get a hallucination back, because there is no state in the LLM: there is no "driver", only contextually, statistically plausible responses. You hold a "driver" object in your mind while maintaining the conversation; the LLM doesn't.

It's like a conversation with someone with short-term memory loss, like in Memento.
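
Here's a toy sketch of that failure mode, purely illustrative (it assumes a word-counting window; real models truncate on tokens, but the effect is the same):

    # Hypothetical fixed-size context window, counted in words for
    # simplicity; real LLMs truncate on tokens.
    WINDOW = 20  # keep only the most recent 20 words

    history = []

    def add_turn(text):
        history.extend(text.split())
        del history[:max(0, len(history) - WINDOW)]  # old words fall off

    add_turn("A driver swerved into my lane on the highway.")
    add_turn("Let's talk about road conditions, tire pressure, and traffic.")
    add_turn("Also, which winter tires would you recommend for snow?")

    print(" ".join(history))
    # "driver" has fallen out of the window, so any answer to "what was
    # the driver doing again?" has to be invented from what's left.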

dauhak|1 year ago

And people with short-term memory loss nevertheless have theory of mind just fine. Nothing about LLMs dropping context over big enough windows implies they don't have theory of mind; it just shows they have limitations, just as humans with "normal" memory will lose track over a huge context window.

There are plenty of shortcomings of LLMs, but it feels like people are comparing them to some Platonic ideal human when writing them off.

Maxatar|1 year ago

I don't think this is a counterexample or even relevant.

I can assure you that if you had a conversation with an LLM and with a human, the human would forget details far sooner than an LLM like Gemini, which can hold about 1.5 million words before running out of context. As an FYI, the average human speaks about 16,000 words per day, so an LLM can remember roughly 93 days' worth of speech.
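
That's just back-of-the-envelope division, with both inputs being rough figures:

    context_capacity_words = 1_500_000  # rough long-context capacity (e.g. Gemini)
    words_spoken_per_day = 16_000       # oft-cited average for humans
    print(context_capacity_words / words_spoken_per_day)  # -> 93.75 days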

Do you remember the exact details, word for word, of a conversation you had 93 days ago?

How about just 4 days ago?

layer8|1 year ago

It’s true that LLMs have only limited short-term memory, and no long-term memory, but that is completely orthogonal to having a theory of mind.

JohnBooty|1 year ago

    once the context is far enough away, you ask:
    what was the driver doing again?

Have you tried this with humans?

For a sufficiently large value of "far away enough" this will absolutely confuse any human as well.

At which point they may ask for clarification, or.... respond in a manner that is not terribly different from an LLM "hallucination" in an attempt to spare you and/or them from embarrassment, i.e. "playing along"

A hallucination is certainly not a uniquely LLM trait; lots of people (including world leaders) confidently spout the purest counterfactual garbage.

    It's like a conversation with someone with
    short-term memory loss, like in Memento

That's still a human with a sound theory of mind. By your logic, somebody with memory issues like that character... is not human? Or...?

I actually am probably on your side here. I do not see these LLMs as being close to AGI. But I think your particular arguments are not sound.

pertymcpert|1 year ago

Short-term memory loss sufferers still have theory of mind, what is this nonsense hahaha

zipy124|1 year ago

I'm not sure I'd say it understands this; rather, there exists an enormous amount of training data on road safety that includes these sorts of examples of people's motivations for poor driving. It is regurgitating the theory of mind that other humans created and put into writing in the training data, rather than making the inference itself.

As with most LLM capabilities, this is hard to benchmark, because you need out-of-distribution data to test it: a theory-of-mind example that is not found in the training set.

skinner_|1 year ago

You dismiss the parent's example test because it's in the training data. I assume you also dismiss the Sally-Anne test (the classic false-belief test: Sally hides a marble, Anne moves it while Sally is away, and the subject must say where Sally will look) for the same reason. Could you please suggest a brand-new test not in the training data?

FWIW, I tried to confuse 4o using the now-standard trick of changing the test to make it pattern-match and overthink it. It wasn't confused at all:

https://chatgpt.com/share/67b4c522-57d4-8003-93df-07fb49061e...

steve_adams_86|1 year ago

Okay, we have fundamentally different understandings here.

To me, the LLM isn't understanding ToM; it's using patterns to predict linguistic structures that match our expectations of ToM. There's no evidence of understanding so much as of accommodating, and those are entirely different things.

I agree that LLMs provide ToM-like features. I do not agree that they possess ToM in a way that makes it a perfectly solved problem within the machine, so to speak.

Maxatar|1 year ago

The problem with this line of argument is that, by its standard, no LLM, nor any algorithm period, can ever have a theory of mind.

If behaving in a way that is identical to a person with actual consciousness can't be considered consciousness because you are familiar with its implementation details, then it's impossible to satisfy you.

Now, you can of course argue that current LLMs do not behave identically to a person, and I agree, and I think most people agree... but things are improving drastically, and it's not clear what things will look like 10 years from now, or even 5.

famouswaffles|1 year ago

Either these supposed differences are important and manifest themselves in observable differences, or they aren't and you're just playing a game of semantics.

How is the LLM not understanding ToM by any standard we measure humans by? I cannot peek into your brain with my trusty ToM-o-meter and measure the amount of ToM flowing in there. With your line of reasoning, I could simply claim you do not understand theory of mind and call it a day.

XorNot|1 year ago

This is a rephrased driver's ed manual. This isn't theory of mind; it's just technically correct prose rewording some unbelievably widely distributed knowledge.

Let's modify this test a little and see if it manages to do anything relevant:

> Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?

>> The driver might be distracted, drowsy, avoiding an obstacle, experiencing a medical issue, or not checking their blind spot while changing lanes.

> The driver in front of you appears to be braking more frequently than usual. The rear suspension of the car appears to be bouncing a lot when they are not braking, indicating acceleration. Their following distance to the car in front is changing from too far to very close frequently. Why might this be?

>> The driver might be inexperienced, distracted, or impatient, leading to inconsistent speed control. The excessive bouncing could indicate worn-out shocks or suspension issues, affecting their vehicle's stability and braking efficiency.

...yep, hard fail from me right there. Two prompts, and it's distracted itself talking about an irrelevant issue ("possible car damage") when the setup of the question should naturally lead to "the driver is driving aggressively". And I could not possibly have set up the conclusion it should draw more clearly. And when pressed... it won't commit:

> Which is the more likely answer?

>> The more likely answer is that the driver is distracted or inexperienced, leading to inconsistent speed control and poor following distance management. The suspension issue likely contributes to the bouncing but isn’t the primary reason for the erratic driving behavior.

Already a hallucination! There is no suspension issue. It has not asked for any additional information to try to determine whether there was one, but it is confidently asserting the existence of a phenomenon it invented in its own response.

famouswaffles|1 year ago

I'm sorry, but what? This is not a theory-of-mind test. You've constructed a very open-ended question with multiple answers and marked the LLM down because you didn't like the one it gave.