top | item 45854017

niklassheth | 3 months ago

This is more evidence that Cognition's SWE-1.5 is a GLM-4.6 finetune

prodigycorp|3 months ago

Can you provide more context for this? (e.g., was SWE-1.5 released recently? Is it considered good? Is it considered fast? Was there speculation about what the underlying model was? How does this prove that it's a GLM finetune?)

NitpickLawyer|3 months ago

People saw Chinese characters in generations from SWE-1.5 (Windsurf's model) and also in Cursor's model. This led to suspicions that the models are finetunes of Chinese models (which makes sense, as there aren't many strong US/EU coding models out there). GLM-4.5/4.6 are the "strongest" coding models at the moment (with DeepSeek V3.2 and Qwen somewhat behind), so that's where the speculation came from. Cerebras serving them at roughly the same speeds kind of adds to that story (e.g., if it were something heavier like DeepSeek V3 or Kimi K2, it would be slower).

mhuffman|3 months ago

I suspect they are referencing the 950 tok/s claim on Cognition's page.

nl|3 months ago

Not at all. Any model with a somewhat similar architecture and roughly similar size should run at about the same speed on Cerebras.

It's like saying Llama 3.2 3B and Gemma 4B are fine-tunes of each other because they run at similar speeds on NVIDIA hardware.
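The point about throughput tracking size rather than lineage can be sketched with a memory-bandwidth estimate: autoregressive decode is usually bandwidth-bound, so tokens/sec is roughly bandwidth divided by active weight bytes. A minimal sketch (all numbers below are hypothetical for illustration, not actual Cerebras, GLM, or SWE-1.5 figures):

```python
# Back-of-envelope: bandwidth-bound autoregressive decode speed.
# Each generated token requires streaming all active weights through
# memory once, so similar-size models land at similar speeds on the
# same hardware regardless of which base model they were tuned from.

def decode_tokens_per_sec(active_params_billions: float,
                          bytes_per_param: float,
                          memory_bandwidth_tbps: float) -> float:
    """Rough upper bound on decode throughput (tokens/second)."""
    weight_bytes = active_params_billions * 1e9 * bytes_per_param
    bandwidth_bytes = memory_bandwidth_tbps * 1e12
    return bandwidth_bytes / weight_bytes

# Two hypothetical models with similar active-parameter counts on the
# same hypothetical 8 TB/s machine come out nearly identical:
speed_a = decode_tokens_per_sec(32, 2.0, 8.0)  # "model A" -> 125 tok/s
speed_b = decode_tokens_per_sec(35, 2.0, 8.0)  # "model B" -> ~114 tok/s
print(round(speed_a), round(speed_b))
```

A much larger dense model (say, hundreds of billions of active parameters) would be several times slower under the same estimate, which is the sense in which matching speeds are weak, circumstantial evidence at best.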