(no title)
throwaway4aday | 1 year ago
> Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.
throwaway4aday | 1 year ago
> Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.
No comments yet.