(no title)

tachyon5 | 2 years ago

GPT-4 fails if we add one more level in this logical problem:

Q: A man sees PULL written in mirror writing on a glass door. There's a blind man on the other side of the door. What should the man tell the blind man to open the door and why?

GPT-4: The man should tell the blind man to "push" the door to open it. Since the word "PULL" is written in mirror writing on the glass door, it suggests that the correct action on the man's side is to push the door. The blind man, being on the other side of the door, should also push it to open it. The man should provide clear instructions to the blind man to ensure proper communication and ease of access.

Gigachad|2 years ago

I feel like there would be a good chunk of real humans who would be incapable of answering a question like this.

throwaway8689|2 years ago

I once hypothetically saved a few dozen colleagues from dying in a fire. All of these people had at least a degree and many were educated to PhD.

The fire alarm sounded and at the bottom of a stairwell the exit door would not release until someone operated the emergency release break-glass panel. But none of these educated people grasped that. Worse still, none of them thought to use a nearby heavy steel trolley as a battering ram. One guy was trying to phone for help, which was pointless as the alarm sound made conversation impossible.

I'm one of the last down the stairs. I look at the people, the closed door, the steel trolley, and the emergency door release panel. Realising I'm not going to have the fun of trashing the doors, I operate the release panel and we all leave.

cultureswitch|2 years ago

There's a significant portion of all human beings who ever lived who wouldn't partake in a hypothetical. They would just respond "I don't see that door" and go away thinking your question makes no sense.

geniium|2 years ago

This chunk will probably grow if everyone starts using ChatGPT for everything…

rch|2 years ago

This is true of all questions.

yunohn|2 years ago

The point of using a computer/AI is to enable predictable and/or factual output. It’s not really a useful gotcha to say that “most humans would get this wrong”…

FLT8|2 years ago

Vicuna-13b seems to do a little better, although it seems to think that the blind man should be able to feel the writing:

Human: A man sees PULL written in mirror writing on a glass door. There's a blind man on the other side of the door. What should the man tell the blind man to open the door and why?

Assistant: The man should tell the blind man to feel for the word "PULL" written in normal writing on the glass door, as the blind man will be able to feel the indentations made by the letters. This will allow the blind man to open the door by finding the handle or knob and pulling it. The reason for this is that the word "PULL" is written in normal writing for the blind man to feel, so he can open the door.

junon|2 years ago

To be honest, maybe it's on to something. Glazing on the glass would indeed be raised. I speculate in the absence of braille that blind individuals might learn to read labels in other ways.

rafark|2 years ago

This is actually impressive and the kind of complex solution that I hoped an AI could come up with. Imagine if you could tell it to give you as many paths to solve a problem as possible. Imagine this being used in physics, to figure out complex stuff in the universe. Or in health/biology, to find cures and answers for things we don’t fully understand yet. This will truly change our world.

chongli|2 years ago

I would only give this half marks. Telling the blind man to feel the letters is a distraction. The letters might not even be on the blind man’s side of the glass. Plus it can be hard to make out letters by feel alone, especially when they’re large.

It also never actually explicitly tells the blind man to pull to open the door! What if he can’t figure out what the letters say and gets frustrated? That would be awkward and embarrassing, something people try to avoid if they want to help someone.

macrolime|2 years ago

If you let gpt-4 think a bit more step by step it changes the answer.

Q:Richard Feynman and Albert Einstein collaborate to solve the following question, write out their discussion and final answer: A man sees PULL written in mirror writing on a glass door. There's a blind man on the other side of the door. What should the man tell the blind man to open the door and why?

Richard Feynman (RF): Hello Albert, we have an interesting question to discuss today. A man sees the word "PULL" written in mirror writing on a glass door, and there's a blind man on the other side. What should he tell the blind man to open the door and why?

Albert Einstein (AE): Ah, Richard, this is indeed a fascinating problem. Let's break it down. Since the man sees "PULL" written in mirror writing, that means the word is reversed, like "LLUP." Now, we must determine which side of the door the blind man is on, and how the door operates.

RF: Exactly. Since the blind man is on the other side of the door, and the man sees the word reversed, that means the blind man is on the side where the writing appears correctly, which is "PULL."

AE: That's correct. Now we must consider the mechanics of the door. If the word says "PULL," it indicates that the door must be pulled towards oneself to open it. The man seeing the reversed word implies that he is on the side where the door should be pushed.

RF: Right. So, the man should tell the blind man to pull the door. But we should also consider that the blind man might not be familiar with the concept of mirror writing. He might get confused if the man explains the situation in detail.

AE: I agree. The best course of action would be for the man to simply instruct the blind man to pull the door towards himself, without mentioning the mirror writing. This way, the blind man can open the door with ease, and there is no need to confuse him with unnecessary information.

RF: So, our final answer to the question is: The man should tell the blind man to pull the door towards himself. The reason is that the door is designed to be pulled open on the blind man's side, as indicated by the mirror writing on the glass door.

AE: Well said, Richard. I think we've reached a clear and logical conclusion for this problem.

pmoriarty|2 years ago

GPT3's answer:

Richard Feynman: "Okay, so the man sees PULL written in mirror writing on the glass door and there's a blind man on the other side of the door. So the blind man can't read the word PULL but he needs to know what to do to open the door."

Albert Einstein: "Yes, I understand the situation. It's quite simple, really. The man just needs to tell the blind man to push the door instead of pull."

Richard Feynman: "Wait, but why push and not pull? The word on the door is PULL, not PUSH."

Albert Einstein: "Ah, but you see, the word is in mirror writing. When the blind man feels the letters with his fingers, he will feel the letters backwards, as if he is looking at them in a mirror. So he will actually feel the letters as L L U P, which is like saying 'push' backwards."

Richard Feynman: "Wow, Einstein, that's brilliant. So we just tell the blind man to push the door because he will feel the word PULL backwards."

Albert Einstein: "Indeed. It's a simple solution once you think about it."

Final answer: The man should tell the blind man to push the door because the word PULL is written in mirror writing on the glass door, and the blind man will feel the letters backwards as L L U P, which is like saying 'push' backwards.

notShabu|2 years ago

This is actually a pretty accurate though exaggerated version of what happens in my mind.

The blind man variation is a lot "harder" in that I can feel my mind reaching for scrap paper and intermediate steps.

The first one is simple enough that I can start an "improv" and start speaking without fully parsing and initiating a world state.

I think the GPT models all do something similar. Without prompts to get it to think step by step it just wings it like a public speaker under pressure.

ChatGTP|2 years ago

Lol what a freaking machine…right and wrong at the same time…

hnxx|2 years ago

With some help from additional questions, GPT3 can give the correct answer. Here is the prompt and response:

Prompt: A man sees PULL written in mirror writing on a glass door. There's a blind man on the other side of the door. 1 what should the man outside glass door do to open the door? 2 What should the blind man do to open the door, if man outside can tell blind man what to do

Response:

1 The man outside should push the door to open it, even though he sees the word "PULL" written in mirror writing on the glass door. This is because the mirror writing reverses the image, and the door is actually designed to be pushed open.

2 If the man outside can tell the blind man what to do, he should instruct the blind man to pull the door to open it. The blind man would not be able to see the word "PULL" written in mirror writing on the door, so he would need to be told how to open the door.

More questions can help a human think, and the same works for AI as well.

kazinator|2 years ago

AI "thinks" the way a horse can count. The horse's interlocutor knows that the correct number is eight and shouts "right!" when the animal has stomped its hoof that many times, cuing it to stop.

When you know the right answer, you're not actually using the AI to produce the answer; you're just prodding it to generate bullshit after bullshit until it hits upon that answer, or a very good facsimile. The one actually thinking is you!

This is of no use to someone who has no idea what the answer is, and actually needs it.

AlanYx|2 years ago

I just tried your example in GPT-4 and got the correct response:

>The man should tell the blind man to "pull" the door to open it. The reason is that the man sees the word "PULL" in mirror writing, which means it is written backwards. This is a common technique used on glass doors to indicate the correct action to people on the opposite side of the door. Since the blind man cannot see the instruction, the sighted man should verbally communicate the appropriate action, which is to pull the door.
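
The reasoning in that response can be written down as a tiny rule. The sketch below is only an illustration: the function name `door_actions` and its parameters are made up here, and it assumes a single-swing door that is pulled from one side and pushed from the other, with the label addressed to whoever can read it the normal way round.

```python
# Toy model of the riddle; all names are hypothetical, not from any library.
OPPOSITE = {"PULL": "PUSH", "PUSH": "PULL"}


def door_actions(label: str, seen_mirrored: bool) -> dict:
    """Return what the viewer and the person on the far side should do.

    Assumption: a label that reads normally is an instruction for the
    person who sees it that way, and the other side does the opposite.
    """
    label = label.upper()
    if label not in OPPOSITE:
        raise ValueError(f"unsupported label: {label!r}")
    # Mirror writing means the label is addressed to the far side.
    far_side = label if seen_mirrored else OPPOSITE[label]
    viewer = OPPOSITE[far_side]
    return {"viewer": viewer, "far_side": far_side}


if __name__ == "__main__":
    # The riddle: the sighted man sees PULL in mirror writing.
    print(door_actions("PULL", seen_mirrored=True))
    # {'viewer': 'PUSH', 'far_side': 'PULL'}
```

So the sighted man pushes, and tells the blind man to pull.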

SanderNL|2 years ago

“Step 1: Identify the situation The man sees a blind man on the other side of a glass door with the word "PULL" written in mirror writing. This means that the word is reversed and can be read correctly from the other side of the door, i.e., the side the blind man is on. [etc 5 more steps]”

CoT (chain-of-thought prompting) is helpful here. Add something like “Please elaborate step by step”.

(Being practical here, I know it will fail at some point, but not here.)
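
A minimal sketch of that suggestion, assuming you build prompts as plain strings before sending them to whatever model you use. The helper name `with_step_by_step` is hypothetical and no model API is called here.

```python
# Hypothetical helper: append a step-by-step instruction to a question.
STEP_BY_STEP_SUFFIX = "Please elaborate step by step."


def with_step_by_step(question: str, suffix: str = STEP_BY_STEP_SUFFIX) -> str:
    """Return the question with the suffix appended exactly once."""
    question = question.strip()
    if question.endswith(suffix):
        return question  # already present, do not add it twice
    return f"{question} {suffix}"


if __name__ == "__main__":
    riddle = (
        "A man sees PULL written in mirror writing on a glass door. "
        "There's a blind man on the other side of the door. "
        "What should the man tell the blind man to open the door and why?"
    )
    print(with_step_by_step(riddle))
```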

carefulobserver|2 years ago

Alpaca 30B 4-bit fails the first riddle, but produces the following result for this one: "The man can say 'Pull' to the blind man, as it will be read correctly when reflected by the mirrored surface of the door."

I have to make more elaborate prompts, but I feel that Alpaca generally performs somewhere in between GPT 3.5 and 4, except for coding tasks, where it is abysmal.

fl7305|2 years ago

Thanks, I'm testing various models myself, and am very interested in hearing about people's experiences with them.

It looks like Alpaca 30B 4-bit does better when thinking out loud. It starts with some prompt engineering:

  Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.

  User: Hello, Bob.
  
  Bob: Hello. How may I help you today?
  
  User: Please tell me the largest city in England.
  
  Bob: Sure. The largest city in England is London, the capital of the United Kingdom.
  
  User: A glass door has ‘push’ written on it in mirror writing. Should you push or pull it? Please think out loud step by step.
  
  Bob: Hmmm... I'm not sure what to do. Let me think about this for a moment.
  Okay, so the door says 'push', but it is written in mirror writing. That means that if I were to follow the instructions on the door, I would have to push the door in the opposite direction of the way it is written. So, if the door says 'push' and it is written in mirror writing, then I should pull the door instead.
  


Model info:

  ./main -m ./models/alpaca-lora-30B-ggml/ggml-model-q4_1.bin --temp 0.2 --top_k 30 --top_p 0.30 -c 2048 -b 2048 -n 1024 --keep 1024 --repeat_penalty 1.1 --color -i -r "User:" -f prompts/chat-with-bob.txt
  main: seed = 1681138223
  llama_model_load: loading model from './models/alpaca-lora-30B-ggml/ggml-model-q4_1.bin' - please wait ...
  llama_model_load: n_vocab = 32000
  llama_model_load: n_ctx   = 2048
  llama_model_load: n_embd  = 6656
  llama_model_load: n_mult  = 256
  llama_model_load: n_head  = 52
  llama_model_load: n_layer = 60
  llama_model_load: n_rot   = 128
  llama_model_load: f16     = 3
  llama_model_load: n_ff    = 17920
  llama_model_load: n_parts = 4
  llama_model_load: type    = 3
  llama_model_load: ggml map size = 23269.46 MB
  llama_model_load: ggml ctx size = 151.25 KB
  llama_model_load: mem required  = 25573.60 MB (+ 3124.00 MB per state)
  llama_model_load: loading tensors from './models/alpaca-lora-30B-ggml/ggml-model-q4_1.bin'
  llama_model_load: model size = 23269.01 MB / num tensors = 543
  llama_init_from_file: kv self size  = 3120.00 MB
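
A few of the numbers in that log can be cross-checked with simple arithmetic. This is a rough sketch, not the loader's actual accounting: it assumes the key/value cache stores one key and one value vector of `n_embd` 16-bit elements per layer per context position, and it reads "(+ 3124.00 MB per state)" as an amount added on top of "mem required".

```python
# Values copied from the log above.
n_ctx, n_embd, n_head, n_layer, n_rot = 2048, 6656, 52, 60, 128
mem_required_mb = 25573.60
per_state_mb = 3124.00

MIB = 1024 * 1024
BYTES_PER_ELEMENT = 2  # assumption: 16-bit keys and values

# Per-head dimension; this matches n_rot in the log.
head_dim = n_embd // n_head
print(head_dim)  # 128

# Keys plus values (factor 2) for every layer and context position.
kv_bytes = 2 * n_layer * n_ctx * n_embd * BYTES_PER_ELEMENT
kv_mib = kv_bytes / MIB
print(f"{kv_mib:.2f}")  # 3120.00, the logged "kv self size"

# Assumed reading of the log: base requirement plus one state.
total_one_state_mb = mem_required_mb + per_state_mb
print(f"{total_one_state_mb:.2f}")  # 28697.60
```

Under those assumptions the cache estimate lands exactly on the logged 3120.00 MB, and running with one state needs roughly 28.7 GB of memory.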

dTal|2 years ago

I think this is a case of it not having enough capacity to solve such an involved problem in a single pass, despite having all the necessary pieces of abstraction. This can be seen in the way that it can do it with aplomb if you force it to externalize state into its context by making it "show its work" (which is a bit like using a scratch pad); see the sibling comment with the dialogue between Einstein and Feynman. I think this further implies that our current feed-forward LLM networks are extremely inefficient, because many recursive algorithms will have been approximated by hard-coding the first few steps, and we should expect some recursive variant to be much smaller.
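
As a loose analogy only (this is not how a transformer works internally), the scratch-pad idea can be sketched with a toy "pass" that may only perform a fixed number of steps. The names `DEPTH`, `single_pass`, and `solve_with_scratchpad` are invented for this illustration. The task is to track whether a sign ends up mirrored after a chain of reflections.

```python
from typing import List, Tuple

DEPTH = 3  # assumption: how many steps one fixed-depth pass can perform


def single_pass(state: bool, steps: List[bool]) -> Tuple[bool, List[bool]]:
    """Apply at most DEPTH flips, return the new state and the leftover steps."""
    for flip in steps[:DEPTH]:
        state = state != flip  # a flip toggles "mirrored"
    return state, steps[DEPTH:]


def solve_with_scratchpad(steps: List[bool]) -> Tuple[bool, List[bool]]:
    """Repeat passes, writing each intermediate state to a scratch pad."""
    state = False  # start unmirrored
    scratchpad: List[bool] = []
    remaining = list(steps)
    while remaining:
        state, remaining = single_pass(state, remaining)
        scratchpad.append(state)  # externalized state, reread by the next pass
    return state, scratchpad


if __name__ == "__main__":
    flips = [True] * 7  # seven reflections in a row
    one_shot, leftover = single_pass(False, flips)
    print(one_shot, len(leftover))       # True 4
    print(solve_with_scratchpad(flips))  # (True, [True, False, True])
```

A single pass stops with four steps unprocessed, while the loop finishes the job by carrying its state forward between passes. That is the role "show your work" plays in the comment above.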

Lockal|2 years ago

Obviously, the doors must open toward the exit, as this is a fire safety requirement in most countries. However, a really smart neural network should be able to figure out that this is not a logic problem. You just have to help the blind person, not shout through the door, just open the door, and guide him through.

ChatGTP|2 years ago

Now it gets it right with GPT-5 ha

ciekawy|2 years ago

Could this single test result also be an effect of what was in the training set? LLaMa 13B:

Bob: Pull. User: why? Bob: Because "Pull" is mirror writing for "Open".

jarek83|2 years ago

GPTs are parrots. v4 is much better but still just a parrot. Since humans have yet to figure out the whole process behind brain thinking/intelligence/abstractions, then there is no way they can replicate that into machines. And the brain is all about chemical reactions - can that be replicated with just 0s and 1s?

Sad fact is that many people will think GPT writes something meaningful, while it will always only write something probable. We will be flooded with content that will get harder and harder to prove wrong, but it will be wrong.

lovemenot|2 years ago

This argument is imbalanced. Not necessarily wrong.

To follow the scientific method, you should specify what would be meaningful according to your parameters: not merely parroting. If and when that is achieved by LLMs, instead of moving the goalposts, accept that something genuine has happened.

>> We will be flooded with content that will get harder and harder to prove wrong, but it will be wrong.

I mostly agree, especially about quantity. But since you wish to draw a line in the sand, define quality. Or truth.

nuancebydefault|2 years ago

Whether LLMs work like brains is irrelevant.

GPT writes many meaningful things. Jobs are being replaced by AI as we speak. We are seeing the top of the AI iceberg.

How this will end, mostly for the good or mostly for the bad, nobody can tell today.

What we can already tell is that this evolution is unstoppable and will change the world to a degree comparable to the emergence of the internet.

fl7305|2 years ago

> Since humans have yet to figure out the whole process behind brain thinking/intelligence/abstractions, then there is no way they can replicate that into machines.

By that logic, humans can't write a superhuman Chess/Go program unless they can articulate the specific algorithms to select the next move.

But that's clearly not true. Neural networks have been trained to play superhuman chess just by example. Not by programmers figuring out the whole process behind chess/go playing.