Reasoning lets you produce statements that are more likely to be true from statements that are known to be true. You'd need to structure your "falsehood training data" in a specific way for an LLM to generalize from it as well as it does from regular data (instead of memorizing noise). And then you'd get a reasoning model that remembers false premises.

You generate your text based on a "stochastic parrot" hypothesis with no post-validation, it seems.

jaen|27 minutes ago

Really, how hard is it to follow the HN guidelines and:

a) not imagine straw-man arguments and not imagine more (or less) than what was said

b) refrain from snarky and false ad hominems

Nothing you said conflicts in any way with what I said, and it again shows a fundamental misunderstanding.

Reasoning is (mostly) part of the post-training dataset. If you make a large majority of the reasoning traces in it false (i.e. paradoxical, irrational, etc.), you'll get a model that successfully replicates the false reasoning of humans. If you mix them in with true reasoning traces, I imagine you'll get infinite-loop behaviour as the reasoning trace oscillates between the true and the false.

The original premise that truth is purely a function of the training dataset still stands... I'm not even sure what people are arguing here, as that seems quite trivially obvious?