This should have been compared with Opus. I know OP says he didn't because of cost, but if you're comparing which is better, you need to compare the best to the best. If Claude Opus 4.1 is significantly better than GPT-5, that could offset the extra expense. I'm not saying it will, but forget cost if we want to compare quality alone.
nearbuy|6 months ago
In one instance, I asked it to optimize a roughly 80-line C# method that matches object positions by object ID and delta-encodes them against the previous frame. It seemed confused about how all this should work and output completely wrong code. It had all the context it needed in the file, and the method is fairly self-contained. Other models did much better. GPT-5 understood what to do immediately.
I tried a few other tasks/questions that also had underwhelming results. Now I've switched to using GPT-5.
If you have a quick prompt you'd like me to try, I can share the results.
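For readers unfamiliar with the task described above, here is a minimal sketch of delta-encoding positions matched by object ID. It is written in Python rather than C#, it is not the method from the comment, and every name in it (`delta_encode`, `delta_decode`, the integer 2D `Position` type) is a hypothetical chosen for illustration. It also assumes that an object missing from the previous frame is encoded against the origin, which the original method may or may not do.

```python
from typing import Dict, Tuple

# Hypothetical position type: integer (x, y) coordinates.
Position = Tuple[int, int]

# Assumption: objects with no previous position are encoded relative to the origin.
ORIGIN: Position = (0, 0)


def delta_encode(prev: Dict[int, Position], curr: Dict[int, Position]) -> Dict[int, Position]:
    """Return, for each object ID in the current frame, its offset from the previous frame."""
    deltas: Dict[int, Position] = {}
    for obj_id, (x, y) in curr.items():
        px, py = prev.get(obj_id, ORIGIN)  # match by object ID
        deltas[obj_id] = (x - px, y - py)
    return deltas


def delta_decode(prev: Dict[int, Position], deltas: Dict[int, Position]) -> Dict[int, Position]:
    """Rebuild the current frame from the previous frame plus the deltas."""
    frame: Dict[int, Position] = {}
    for obj_id, (dx, dy) in deltas.items():
        px, py = prev.get(obj_id, ORIGIN)
        frame[obj_id] = (px + dx, py + dy)
    return frame


prev_frame = {1: (10, 10), 2: (5, 7)}
curr_frame = {1: (12, 9), 3: (4, 4)}

deltas = delta_encode(prev_frame, curr_frame)
restored = delta_decode(prev_frame, deltas)
```

In this sketch, object 2 simply drops out because it is absent from the current frame; a real encoder would likely need an explicit removal signal.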
cpursley|6 months ago
bongodongobob|6 months ago
However, if I'm not detailed with it, it does seem to make weird choices that end up being unmaintainable. It's like it has poor creative instincts but is really good at following the directions you give it.
muzani|6 months ago
intellectronica|6 months ago
runako|6 months ago
Also keep in mind that many employees are not paying out of pocket for LLM use at work. A $1,000 monthly bill for LLM usage is high for an individual but not so much for a company that employs engineers.
michaelt|6 months ago
They're impressive despite that. But if Sonnet is $20/month and I have to intervene every 3 minutes, while Opus is $100/month and I have to intervene every 5 minutes? ¯\_(ツ)_/¯
qeternity|6 months ago
I think this is the whole reason not to compare it to Opus...
bgirard|6 months ago
sergiotapia|6 months ago
andsoitis|6 months ago
I believe Opus starts at $20 a month, similar to GPT-5, if you want more than just cursory usage.
Or am I missing something?
markbao|6 months ago
fouc|6 months ago
senko|6 months ago
> Our smartest, fastest, and most useful model yet
I'd say it's definitely supposed to be the best; it just doesn't deliver.