top | item 45808682

(no title)

MrScruff | 3 months ago

When we say "think" in this context, do we just mean generalize? LLMs clearly generalize (you can give one a problem that is not exactly in it's training data and it can solve it), but perhaps not to the extent a human can. But then we're talking about degrees. If it was able to generalize at a higher level of abstraction maybe more people would regard it as "thinking".

discuss

dns_snek|3 months ago

I meant it in the same way the previous commenter did:

> Having seen LLMs so many times produce incoherent, nonsensical and invalid chains of reasoning... LLMs are little more than RNGs. They are the tea leaves and you read whatever you want into them.

Of course LLMs are capable of generating solutions that aren't in their training data sets but they don't arrive at those solutions through any sort of rigorous reasoning. This means that while their solutions can be impressive at times they're not reliable, they go down wrong paths that they can never get out of and they become less reliable the more autonomy they're given.

dagss|3 months ago

It's rather seldom that humans arrive at solutions through rigorous reasoning. The word "think" doesn't mean "rigorous reasoning" in every day language. I'm sure 99% of human decisions are pattern matching on past experience.

Even when mathematicians do in fact do rigorous reasoning, they use years to "train" first, to get experiences to pattern match from.

Workaccount2|3 months ago

I have been on a crusade now for about a year to get people to share chats where SOTA LLMs have failed spectacularly to produce coherent, good information. Anything with Heavy hallucinations and outright bad information.

So far, all I have gotten is data that is outside the knowledge cutoff (this is by far the most common) and technicality wrong information (Hawsmer House instead of Hosmer House) kind of fails.

I thought maybe I hit on something with the recent BBC study about not trusting LLM output, but they used 2nd shelf/old mid-tier models to do their tests. Top LLMs correctly answered their test prompts.

I'm still holding out for one of those totally off the rails Google AI overviews hallucinations showing up in a top shelf model.

MrScruff|3 months ago

Sure, and I’ve seen the same. But I’ve also seen the amount to which they do that decrease rapidly over time, so if that trend continues would your opinion change?

I don’t think there’s any point in comparing to human intelligence when assessing machine intelligence, there’s zero reason to think it would have similar qualities. It’s quite clear for the foreseeable future it will be far below human intelligence in many areas, while already exceeding humans in some areas that we regard as signs of intelligence.