FatherOfCurses | 13 days ago
People are putting trust in LLMs to provide answers to questions that they haven't properly formed, and acting on solutions that the LLMs haven't properly understood.
And please don't tell me that people need to provide better prompts. That's just Steve Jobs saying "You're holding it wrong" during AntennaGate.
jmward01|13 days ago
Retric|13 days ago
godelski|13 days ago
Are you criticizing LLMs? Highlighting the importance of this training and why we're trained that way even as children? That it is an important part of what we call reasoning?
Or are you giving LLMs the benefit of the doubt, saying that even humans have these failure modes?[0]
Though my point is more that natural language is far more ambiguous than I think people give it credit for. I'm personally always surprised that a bunch of programmers don't understand why programming languages were developed in the first place. They're hard to use precisely because of their lack of ambiguity, at least compared to natural languages. And we can see clear trade-offs in how high-level a language is. Duck typing is both incredibly helpful and a major nuisance. It's the same reason even a technical manager often has a hard time communicating instructions. Compression of ideas isn't very easy.
[0] I've never fully understood that argument. Wouldn't we call a person stupid for giving a similar answer? How does the existence of stupid people mean we can't call LLMs stupid? It's simultaneously anthropomorphising while being mechanistic.
cracki|13 days ago
I did not catch that in the first pass.
I read it as the casualties, who would be buried wherever the next of kin or the will says they should.
yakbarber|13 days ago
contravariant|13 days ago
That's also something people seem to miss in the Turing Test thought experiment. I mean, sure, just deceiving someone is a thing, but the simplest chatbot can achieve that. The really interesting implications start when there's genuinely no way to tell a chatbot apart from a human.
TheJoeMan|13 days ago
jader201|13 days ago
The problem is that most LLMs answer it correctly (see the many other comments in this thread reporting this). OP cherry-picked the few that answered it incorrectly and didn't mention any that got it right, implying that 100% of them got it wrong.
thinkling|13 days ago
That seems problematic for a very basic question.
Yes, models can be harnessed with structures that run queries 100x and take the "best" answer, and we can claim that if the best answer gets it right, models therefore "can solve" the problem. But for practical end-user AI use, high error rates are a problem and greatly undermine confidence.
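To put rough numbers on that gap, here is a minimal sketch. The 70% per-query success rate is made up for illustration, and the samples are assumed to be independent, which real model samples are not guaranteed to be.

```python
def p_any_correct(p_single: float, n: int) -> float:
    """Chance that at least one of n independent samples is correct."""
    return 1.0 - (1.0 - p_single) ** n


def p_user_sees_wrong(p_single: float) -> float:
    """Chance that the one answer an end user actually gets is wrong."""
    return 1.0 - p_single


# Hypothetical figure, not a measurement of any real model.
P_SINGLE = 0.70

if __name__ == "__main__":
    print(f"best-of-100 contains a correct answer: {p_any_correct(P_SINGLE, 100):.6f}")
    print(f"single answer is wrong: {p_user_sees_wrong(P_SINGLE):.2f}")
```

Under those assumptions the best-of-100 harness almost always contains a correct answer, while the single answer a user sees is still wrong about three times in ten. It also assumes something can reliably pick the correct sample out of the hundred.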
rluna828|12 days ago
serial_dev|13 days ago
raincole|13 days ago
You can even see those in this very thread. Some commenters even believe that they add internal prompts for this specific question (as if people weren't attempting to fish out ChatGPT's internal prompts 24/7, and as if there weren't open-weight models that answer this correctly).
You can never win.
jlarocco|13 days ago
pvillano|13 days ago
pvillano|13 days ago
I know nothing about chemistry. My smartest move was to not provide the color and ask what the color might have been. It never guessed blue or purple.
In fact, it first asked me if this was high school or graduate chemistry. That's not... and it makes me think I'll only get answers to problems that are easily graded, and therefore have only one unambiguous solution.
rluna828|12 days ago
xdennis|13 days ago
It would be interesting to actually ask a group of people this question. I'm pretty sure a lot of people would fail.
It feels like one of those puzzles that people often fail. E.g.: 'Ten crows are sitting on a power line. You shoot one. How many crows are left to shoot?' People often think it's a subtraction problem and don't consider that animals flee after gunshots. (BTW, ChatGPT also answers 9.)
dingaling|13 days ago
Loughla|13 days ago
The difference between someone who is really good with LLMs and someone who isn't is the same as the difference between someone who's really good at technical writing or working with other people and someone who isn't.
Communication. Clear, concise communication.
And my parents said I would never use my English degree.
biot|13 days ago
rluna828|12 days ago
CamperBob2|13 days ago