sundarurfriend | 2 months ago
Edit: Just noticed this note:
> Also note pre-training and post-training benchmarks are different, so scores are not comparable across plots.
The paper gives more details about the specific benchmarks and the scores obtained on them: https://arxiv.org/html/2512.14856v1#S4