Comparison to GPT-OSS-20B (regardless of how you feel that model actually performs) doesn't fill me with confidence. Given that GLM 4.7 seems like it could be competitive with Sonnet 4/4.5, I would have hoped their flash model would run circles around GPT-OSS-120B. I do wish they would provide an Aider result for comparison. Aider may be saturated among SotA models, but it's not at this size.
unsupp0rted|1 month ago
Not for code. The quality is so low it's roughly on par with Sonnet 3.5.