item 42909139

marxplank | 1 year ago
That would help with decidable problems, but it still would not generalise to problems with non-trivial rewards, or to ones with no reward at all.

astrange | 1 year ago
Reasoning seems to generalize, insofar as o1 and DeepSeek-R1 are better at answering questions than their base models.