top | item 42772954

justinl33 | 1 year ago

> This is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.

This is a noteworthy achievement.
