item 47060130

crimsoneer | 11 days ago

I mean, the flipside is that we have been tricking humans with this sort of thing for generations. We've all seen a hundred variations on "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?" or "If 5 machines take 5 minutes to make 5 widgets, how long do 100 machines take to make 100 widgets?" or even the whole "the father was the surgeon" story.

If you don't recognise the problem and actively engage your "system 2 brain", it's very easy to just leap to the obvious (but wrong) answer. That doesn't mean you're not intelligent and can't work it out if someone points out the problem. It's just that the heuristics you've been trained to adopt betray you here, and that's really not so different from what's tripping up these LLMs.
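For the record, the correct answers to both puzzles mentioned above can be checked in a few lines (a Python sketch; only the arithmetic of the puzzles themselves comes from the thread):

```python
# Bat-and-ball: bat + ball = 1.10 and bat = ball + 1.00.
# Substituting: (ball + 1.00) + ball = 1.10  ->  2 * ball = 0.10  ->  ball = 0.05.
ball = (1.10 - 1.00) / 2
bat = ball + 1.00
assert abs(ball - 0.05) < 1e-9        # the intuitive answer, $0.10, is wrong
assert abs(bat + ball - 1.10) < 1e-9  # prices still sum to $1.10

# Widgets: 5 machines make 5 widgets in 5 minutes, so one machine makes
# one widget in 5 minutes. 100 machines making 100 widgets is still one
# widget per machine, so it still takes 5 minutes (not 100).
minutes_per_widget_per_machine = 5
widgets_per_machine = 100 / 100
time_needed = widgets_per_machine * minutes_per_widget_per_machine
assert time_needed == 5
```

The intuitive-but-wrong answers ($0.10 and 100 minutes) are exactly the "system 1" leaps the comment describes.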


imiric|11 days ago

But this is not a trick question[1]. It's a straightforward question which any sane human would answer correctly.

It may trigger a particularly ambiguous path in the model's token weights, or whatever the technical explanation for this behavior is, which can certainly be addressed in future versions, but what it does is expose the fact that there's no real intelligence here. For all its "thinking" and "reasoning", the tool is incapable of arriving at the logically correct answer, unless it was specifically trained for that scenario, or happens to arrive at it by chance. This is not how intelligence works in living beings. Humans don't need to be trained at specific cognitive tasks in order to perform well at them, and our performance is not random.

But I'm sure this is "moving the goalposts", right?

[1]: https://news.ycombinator.com/item?id=47060374

crimsoneer|11 days ago

But this one isn't a trick question either, right? It's just basic maths, and a quirk of how our brains work means plenty of people don't engage the part of their brain that says "I should stop and think this through", and instead rush to the first number that pops into their head. But that number is wrong, and is a result of our own weird "training" (in that we all use a bunch of mental shortcuts for maths, and sometimes they lead us astray).

"A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?"

And yet 50% of MIT students fall for this sort of thing[1]. They're not unintelligent; it's just that a specific problem can make your brain fail in weird, specific ways. Intelligence isn't a scale from 0-100, or some binary yes-or-no question; it's a bunch of different things. LLMs probably are less intelligent on a bunch of those scales, but this one specific example doesn't tell you much more than that they have weird quirks, just like we do.

[1] https://www.aeaweb.org/articles?id=10.1257/08953300577519673...

valdork59|11 days ago

And how many variations of trick questions do you think the LLM has seen?