srush | 4 months ago

The blog talks about the training process. Specifically, we trained with RL post-training on coding examples.

chis | 4 months ago

Makes sense, but what model was used for the base? Is it some open-source model, and you're not at liberty to disclose?

W0WL0LXD | 4 months ago

Not a Cursor employee, but still a researcher: it's Zhipu/Z.ai GLM-4.6/4.5. There are traces of Chinese in the reasoning output, it's the only model that would make sense to do this with RL, and it already delivers near-SOTA performance and is open-source/open-weight.

Cursor Composer and Windsurf SWE 1.5 are both finetuned versions of GLM.

chaidhat | 4 months ago

That's cool, thanks!