top | item 42840956

veesahni | 1 year ago

Reasoning models are the result of lessons learned from CoT prompting.

s1mplicissimus | 1 year ago

I'm curious what the key differences are between "a reasoning model" and good old CoT prompting. Is there any reason to believe that the fundamental limitations of prompting don't apply to "reasoning models" (hallucinations, plainly wrong output, bias toward the training-data mean, etc.)?

itchyjunk | 1 year ago

The level of sophistication of CoT models varies. "Good old CoT prompting" is you hoping the model generates some reasoning tokens prior to the final answer. When it did, the answers tended to be better for certain classes of problems, but you had no control over what type of reasoning tokens it was generating. There was a hypothesis that just having <pause> tokens in between produced better answers, since it allowed n+1 steps to generate an answer instead of n.

I would consider Meta's "continuous chain of thought" to be on the other end of the spectrum from "good old CoT prompting": they pass the hidden state from the latent space back into the model, getting a "BHF"-like effect. Who knows what's happening with o3 and Anthropic's o3-like models.

The problems you mention are very broad and not limited to prompting. Reasoning models tend to outperform older models on math problems, so I'd assume they do reduce hallucination for certain classes of problems.
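To make the contrast concrete, here's a minimal sketch of what "good old CoT prompting" amounts to: the only lever is the prompt string itself. The function names and the step-by-step cue below are illustrative, not tied to any real vendor SDK; a reasoning model, by contrast, does this internally and you don't construct the reasoning prompt yourself.

```python
# Sketch of zero-shot CoT prompting: all control lives in the prompt text.
# Function names here are hypothetical, not any specific API.

def build_plain_prompt(question: str) -> str:
    """Plain prompt: the model may answer directly with no reasoning tokens."""
    return f"Q: {question}\nA:"

def build_cot_prompt(question: str) -> str:
    """Zero-shot CoT: append a cue hoping the model emits reasoning tokens
    before its final answer. You cannot control what those tokens contain."""
    return f"Q: {question}\nA: Let's think step by step."

question = "If a train travels 60 km in 1.5 hours, what is its average speed?"
print(build_plain_prompt(question))
print(build_cot_prompt(question))
```

The point of the sketch is that the "reasoning" is purely an artifact of the prompt suffix; nothing in training enforces that the generated steps are faithful or even used to produce the answer.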