rors|1 year ago
It's interesting that this paper doesn't contradict the conclusions of the Apple LLM paper[0], where prompts were corrupted to force the LLM into making errors. I can also believe that LLMs can only make small deviations from existing example solutions when generating these novel solutions.
I hate that we're using the term "reasoning" for this solution generation process. It's a term coined by LLM companies to evoke an almost emotional response in how we talk about this technology. However, it does appear that we are capable of instructing machines to follow a series of steps using natural language, with some degree of ambiguity. That in and of itself is a huge stride forward.
pfisherman|1 year ago
That being said, I do think that LLMs are capable of “verbal reasoning” operations. I don’t have a good sense of the boundaries that distinguish the different kinds of logic: verbal, qualitative, and quantitative reasoning. What comes to mind are the verbal sections of standardized tests.
eru|1 year ago
Well, if you do all that, would you say that the system as a whole has 'reasoned'? (I think ChatGPT can already call out to Python.)
joe_the_user|1 year ago
You can believe it, but what sort of evidence are you using for this belief?
Edit: Also, the abstract of the Apple paper hardly says "corruption" (implying something tricky); it says that they changed the initial numerical values.
famouswaffles|1 year ago
The only thing that does is the benchmark that introduces "seemingly relevant but ultimately irrelevant information".
fragmede|1 year ago
Anthropomorphizing computers has been happening since long before ChatGPT. No one thinks their computer is actually eating their homework when they say that to refer to the fact that their computer crashed and their document wasn't saved; it's just an easy way to refer to the thing it just did. Before LLMs, "the computer is thinking" wasn't an unuttered sentence.

Math terms aren't well known to everybody, so if I say Claude is dot-producting an essay for me, or that I had ChatGPT dot-product that letter to my boss, no one knows what a dot product is, so even if that's a more technically accurate verb, who's gonna use it? So while AI companies haven't done anything to promote usage of terms other than "thinking" and "reasoning", it's also because those are the handiest terms. It "thinks" there are two R's in strawberries. It dot-products there are two R's in strawberries. It also matrix multiplies, occasionally softmaxes; convolves. But most people aren't Terence Tao and don't have a feel for when something's softmaxing, because what does that even mean?
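For the curious, here's a minimal toy sketch in numpy (my own illustration, not any lab's actual code) of scaled dot-product attention, the transformer operation where those dot products, softmaxes, and matrix multiplies actually happen:

    import numpy as np

    def softmax(x, axis=-1):
        # Shift by the max for numerical stability before exponentiating.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention(Q, K, V):
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)  # dot products: every query against every key
        weights = softmax(scores)      # "softmaxing" turns scores into probabilities
        return weights @ V             # matrix multiply mixes the value vectors

    # Toy usage: 4 tokens with 8-dimensional embeddings.
    rng = np.random.default_rng(0)
    Q, K, V = rng.standard_normal((3, 4, 8))
    out = attention(Q, K, V)
    print(out.shape)  # (4, 8)

Point being, that's all "thinking" is under the hood, and nobody is going to describe it that way in casual conversation.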
ucefkh|1 year ago
They still can't think outside the box of their datasets.