xendo | 1 year ago

Any idea whether the same dataset can be used to improve human reasoning? Let's say I manually analyze 817 math examples: would that be an optimal strategy for improving my math reasoning? Can the same distillation process be applied to LeetCode?

viraptor|1 year ago

This training is less about learning how to reason and more about conditioning the LLM to use self-evaluations automatically. You could probably reproduce this effect yourself by sticking a paper reminder in front of you and writing "after every small step, spend 2 minutes considering whether it's right and whether it works in the context of the task so far; evaluate alternatives" on it. (Which, yes, could likely improve reasoning.)