It feels like Anthropic's models from 6 months ago. I mean, it's great progress in the open weight world, but I don't have time to use anything less than the very best for the coding I do. At the same time, if Anthropic and OpenAI disappeared tomorrow, I could survive with GLM-5.
apples_oranges|18 days ago
machiaweliczny|18 days ago
Codex: better with rate-limits, 5.2 strong with logic problems
Cursor: cursor auto - a bit dumb still but I use the most for writing not really thinking, it's also good at searching through codebase and doing summaries etc.
Claude / Codex still miss tons of scaffolding for sane development or it's due to sandboxes or sth. Like for example you ask in /plan mode to check think with link to github and it does navigate github via curl, hitting rate limits etc. instead of just git clone, repomix etc. so scaffolding still matters a lot. Like it still lacks a tons of common sense
egeozcan|18 days ago
I also can create a feedback loop and let it run wild, which also works but that needs also planning and a harness, and rules etc. Usually not worth it if you need to jump between a million things like me.
9cb14c1ec0|18 days ago
Gud|18 days ago