AI researchers need to read more cognitive science. It is genuinely embarrassing how often you see "Thinking, Fast and Slow" plus some 50-year-old paper as the only citations, because this statement:
> In human cognition theory, human thinking is governed by two systems: the fast and intuitive System 1 and the slower but more deliberative System 2.
is intuitive, psychologically seductive, and blatantly wrong.[1] There is no scientific distinction between System 1 and System 2; the very idea is internally incoherent and contradicts the evidence. Yet tons of ignorant people believe it. And apparently AI researchers sincerely believe "ANN inference = System 1 thinking." This is ridiculous: ANN inference = Pavlovian response, the kind found in nematodes and jellyfish. System 1 thinking, by contrast, is related to the common sense found in all vertebrates and absent from all existing AI. We don't have a clue how to make a computer capable of System 1 thinking.
This isn't just pedantry: the initial "System 1 = inference" error makes "System 2 = chain-of-thought" especially flawed. CoT in transformer LLMs helps solve O(n) problems but struggles with O(n^2) ones. The observation that an O(n^2) problem can be broken down into n separate O(n) problems is ultimately a piece of System 1 reasoning: it is obviously true. But it is only obviously true to smart things like humans and pigeons. Transformers do not seem smart enough to grasp it: System 2 thinking must be "glued together" by tautologies or axioms, and we can only recognize tautologies or discover axioms because of System 1. If the problem is more complex than O(n), these tautologies and axioms must be provided to the LLM, either with a careful prompt or with exhaustive training data.
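To make the decomposition concrete, here is a toy Python sketch (my own example, not anything from the paper or the comment): the O(n^2) problem "does any pair in a list sum to a target?" split into n independent O(n) subproblems, each the kind of step a CoT trace would have to spell out explicitly.

```python
# Toy illustration (my example): an O(n^2) problem decomposed into
# n independent O(n) subproblems, one per element of the list.

def pair_sum_naive(xs, target):
    # O(n^2) formulation: check every pair directly.
    return any(xs[i] + xs[j] == target
               for i in range(len(xs))
               for j in range(i + 1, len(xs)))

def pair_sum_decomposed(xs, target):
    # The same problem as n separate O(n) scans: fix xs[i], then ask the
    # O(n) question "does target - xs[i] appear later in the list?"
    def subproblem(i):
        return (target - xs[i]) in xs[i + 1:]
    return any(subproblem(i) for i in range(len(xs)))

assert pair_sum_naive([3, 8, 5, 1], 9) == pair_sum_decomposed([3, 8, 5, 1], 9)
```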
Kahneman's book has been largely repudiated on the science. That doesn't mean it isn't a useful way to understand the kinds of errors humans make in decision-making. But it does make the book useless for AI researchers: I believe AGI is well over 200 years away, because, going all the way back to Alan Turing, AI has simply refused to engage with the challenges of cognitive science, preferring fairy tales that confirm intuitions and trivialize human minds.

[1] https://www.cell.com/trends/cognitive-sciences/abstract/S136... and https://www.psychologytoday.com/intl/blog/a-hovercraft-full-...
> We don't have a clue how to make a computer capable of System 1 thinking.
I think you’re overthinking this. System 1 thinking, as the term is used by AI researchers, means making a fast decision based on reasoning processes that are wired into your brain by evolution. For any task that humans have faced for millions of years, this works well. It can also work well for experts in a domain who have practiced a task so many times that their brains have adapted to perform it unconsciously.
System 2 thinking is consciously using explicit reasoning techniques to think through a problem slowly and rigorously, often in ways that feel unnatural because of our cognitive biases, but that can solve problems System 1 cannot.
The analogy to LLMs is straightforward: LLMs learn to solve many kinds of complex problems during training and encode processes for those specific problems. They can then perform these tasks in a single forward pass through their weights. This is System 1 for LLMs and, again, it works well for any task that they were exposed to repeatedly during training.
However, they don’t generalize to tasks that were not well represented in the training data. Training them to use explicit reasoning strategies instead (System 2) has been shown to improve performance and lets them solve a broader range of problems.
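A rough sketch of the distinction being drawn, with ask() standing in for a single LLM call (a hypothetical helper, not any real API):

```python
# Hypothetical sketch of the "System 1 vs System 2" split described above.
# `ask` stands in for a single LLM call; it is not a real library API.

def ask(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM client here")

def system1_answer(question: str) -> str:
    # "System 1": answer in one shot, no externalized intermediate reasoning.
    return ask(question)

def system2_answer(question: str, max_steps: int = 8) -> str:
    # "System 2": make the model write out intermediate steps and condition
    # each step on the ones before it (a chain-of-thought loop).
    steps = []
    for _ in range(max_steps):
        step = ask(
            f"Problem: {question}\n"
            "Steps so far:\n" + "\n".join(steps) + "\n"
            "Write the next step, or 'DONE: <answer>' if finished."
        )
        if step.startswith("DONE:"):
            return step[len("DONE:"):].strip()
        steps.append(step)
    return ask(f"Problem: {question}\nSteps:\n" + "\n".join(steps) + "\nFinal answer:")
```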
So, in one sense I agree with you. There is zero evidence that the human brain runs separate systems for separate types of cognition.
On the other hand, the reason this idea is sticky is that it matches our conscious experience. In some situations, we respond intuitively. In other situations, we choose to work analytically using tools like research, deliberation, note-taking, etc.
I think it's this second sense in which people are using the term with respect to LLMs. And it's not a terrible analogy.
However, comparing "neural networks" to actual neurons is almost never useful.
That line in the abstract made me chuckle too, as a cognitive psychologist of memory (though currently doing autism research). It's an analogy, not some well-validated law of cognition. I think there are a lot of concepts in cognitive psychology that may be useful to machine learning research, but they should actually, maybe, invite some cognitive scientists to work with them on their machine learning research (or at least help them craft their abstracts lol)
Yeah, they could call it something like retrieval-dominant information generation or computation-dominant information generation, or at least not mention cognitive sciences.
As I understand it, the LLM uses the techniques of Searchformer (https://arxiv.org/abs/2402.14083) to do "slow thinking": an A* search carried out with a transformer.
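For reference, Searchformer trains a transformer on execution traces of A* search. A minimal A* on a grid looks roughly like the sketch below; this is my own illustration, not code from the paper.

```python
import heapq

# Minimal A* on a 4-connected grid (0 = free cell, 1 = wall). My own
# reference sketch: Searchformer trains a transformer on the traces
# produced by a search like this one, it does not ship this code.

def astar(grid, start, goal):
    def h(p):  # Manhattan-distance heuristic (admissible on a grid)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]            # entries are (f = g + h, g, node)
    came_from, best_g = {start: None}, {start: 0}

    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node == goal:                         # reconstruct the path
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dr, node[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < best_g.get(nxt, float("inf"))):
                best_g[nxt] = g + 1
                came_from[nxt] = node
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None                                  # goal unreachable

print(astar([[0, 0, 0],
             [1, 1, 0],
             [0, 0, 0]], (0, 0), (2, 0)))
```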
What do you mean by `n` in this case?