sharemywin | 6 months ago

The big step was having it reason through math problems that weren't in the training data. Even now, with web search, it doesn't need every article to be in the training data to do useful things with it.

Ferrus91|6 months ago

This is using test-time ("thinking") compute and reinforcement learning. I think this is going to plateau even faster than the initial LLM scaling did, though.