jzig | 6 months ago

I'm confused by how Opus is presented to be superior in nearly every way for coding purposes yet the general consensus and my own experience seem to be that Sonnet is much much better. Has anyone switched to entirely using Opus from Sonnet? Or maybe switching to Opus for certain things while using Sonnet for others?

SkyPuncher|6 months ago

I don't doubt Opus is technically superior, but it's not practically superior for me.

It's still pretty much impossible to have any LLM one-shot a complex implementation. There's just too many details to figure out and too much to explain for it to get correct. Often, there's uncertainty and ambiguity that I only understand the correct answer (or rather less bad answer) after I've spent time deep in the code. Having Opus spit out a possibly correct solution just isn't useful to me. I need to understand _why_ we got to that solution and _why_ it's a correct solution for the context I'm working in.

For me, this means that I largely have an iteratively driven implementation approach where any particular task just isn't that complex. Therefore, Sonnet is completely sufficient for my day-to-day needs.

bdamm|6 months ago

I've been having a great time with Windsurf's "Planning" feature. Have a nice discussion with Cascade (Claude) all about what it is that needs to happen - sometimes a very long conversation including test code. Then when everything is very clear, make it happen. Then test and debug the results with all that context. Pretty nice.

ssk42|6 months ago

You can also always have it create design docs and mermaid diagrams for each task. It's much easier to outline the why earlier, shifting left.

adastra22|6 months ago

Every time that Sonnet is acting like it has brain damage (which is once or twice a day), I switch to Opus and it seems to sort things out pretty fast. This is unscientific anecdata though, and it could just be that switching models (any model) would have worked.

gpm|6 months ago

This seems like a case of reversion to the mean. When one model is performing below average, changing anything (like switching to another model) is likely to improve it by random chance...
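
A minimal simulation of that effect, under a loud assumption that is not from the thread: every attempt's quality is an independent uniform draw, so "switching" changes nothing real. The function name and the 0.2 cutoff for a "bad" attempt are made up for illustration.

```python
import random

def improvement_rate_after_bad_turn(trials=100_000, bad_cutoff=0.2, seed=0):
    """Estimate how often the next attempt beats a bad one when nothing real changes.

    Assumption: each attempt's quality is an independent uniform draw in [0, 1),
    so "switching models" has no true effect at all.
    """
    rng = random.Random(seed)
    bad_turns = 0
    improved = 0
    for _ in range(trials):
        first = rng.random()
        if first >= bad_cutoff:
            continue  # we only "switch" after a noticeably bad attempt
        bad_turns += 1
        second = rng.random()  # same distribution: the switch did nothing
        if second > first:
            improved += 1
    return improved / bad_turns

rate = improvement_rate_after_bad_turn()
print(f"next attempt was better {rate:.0%} of the time")
```

Under these assumptions the expected rate is 1 - bad_cutoff/2, i.e. about 0.9, even though the "switch" had no effect. Real model outputs are not independent uniform draws, so treat this only as an illustration of why the anecdote alone can't separate the two explanations.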

monatron|6 months ago

This is a great use case for sub-agents IMO. By default, sub-agents use sonnet. You can have opus orchestrate the various agents and get (close to) the best of both worlds.

HarHarVeryFunny|6 months ago

Maybe context rot? If the model's output seems to be getting worse or stuck in a rut, then try just clearing context / starting a new session.

j45|6 months ago

They both seem to behave differently depending on how loaded the system seems to be.

aghilmort|6 months ago

switching models great best practice whether get stuck or not

can look at primal check the mean or dual get out of local minima

in all cases, model, tokenizer, etc is just enough different that will generally pay off in spaces quickly

Uehreka|6 months ago

> yet the general consensus and my own experience seem to be that Sonnet is much much better

Given that there’s nothing close to scientific analysis going on, I find it hard to tell how big the “Sonnet is overall better, not just sometimes” crowd is. I think part of the problem is that “The bigger model is better” feels obvious to say, so why say it? Whereas “the smaller model is better actually” feels both like unobvious advice and also the kind of thing that feels smart to say, both of which would lead to more people who believe it saying it, possibly creating the illusion of consensus.

I was trying to dig into this yesterday, but every time I come across a new thread the things people are saying and the proportions saying what are different.

I suppose one useful takeaway is this: If you’re using Claude Max and get downgraded from Opus to Sonnet for a few hours, you don’t have to worry too much about it being a harsh downgrade in quality.

MostlyStable|6 months ago

Opus seems better to me on long tasks that require iterative problem solving and keeping track of the context of what we have already tried. I usually switch to it for any kind of complicated troubleshooting etc.

I stick with Sonnet for most things because it's generally good enough and I hit my token limits with it far less often.

unshavedyak|6 months ago

Same. I'm on the $200 plan and I find Opus "better", but Sonnet is more straightforward. Sonnet is, to me, a "don't let it think" model. It does great if you give it concrete and small goals. Anything vague or broad and it starts thinking and it's a problem.

Opus gives you a bit more rope to hang yourself with imo. Yes, it "thinks" slightly better, but still not good enough to me. But it can be good enough to convince you that it can do the job.. so i dunno, i almost dislike it in this regard. I find Sonnet just easier to predict in this regard.

Could i use Opus like i do Sonnet? Yes definitely, and generally i do. But then i don't really see much difference since i'm hand-holding so much.

jm4|6 months ago

I use both. Sonnet is faster and more cost efficient. It's great for coding. Where Opus is noticeably better is in analysis. It surpasses Sonnet for debugging, finding patterns in data, creativity and analysis in general. It doesn't make a lot of sense to use Opus exclusively unless you're on a max20 plan and not hitting limits. Using Opus for design and troubleshooting and Sonnet for everything else is a good way to go.

biinjo|6 months ago

I'm on the Max plan and generally Opus seems to do better work than Sonnet. However, that’s only when they allow me to use Opus. The usage limits, even on the Max plan, are a joke. Yesterday I hit the limits within MINUTES of starting my work day.

furyofantares|6 months ago

I'm a bit confused by people hitting usage limits so quickly.

I use Opus exclusively and don't hit limits. ccusage reports I'm using the API-equivalent of $2000/mo

Aeolun|6 months ago

Is this on x5? Because ever since they booted all the freeloaders I’ve not once seen the “you are approaching usage limits” message. Anyway, the “you are approaching usage limits” message shows up when you are over 50% of your tokens for that timeframe, so it’s not super useful.

epolanski|6 months ago

Yeah, you need to actively cherry-pick which model to use in order to not waste tokens on stuff that would be easily handled by a simpler model.

dsrtslnd23|6 months ago

Same here, I constantly hit the Opus limits after minutes on the Max plan.

dested|6 months ago

If I'm using Cursor then Sonnet is better, but in Claude Code Opus 4 is at least 3x better than Sonnet. As with most things these days, I think a lot of it comes down to prompting.

jzig|6 months ago

This is interesting. I do use Cursor with almost exclusively Sonnet and thinking mode turned on. I wonder if what Cursor does under the hood (like their indexing) somehow empowers Sonnet more. I do not have much experience with using Claude Code.

datameta|6 months ago

I now eagerly await Sonnet 4.1, only because of this release.

astrostl|6 months ago

With aggressive Claude Code use I didn't find Sonnet better than Opus but I did find it faster while consuming far fewer tokens. Once I switched to the $100 Max plan and configured CC to exclusively use Sonnet I haven't run into a plan token limit even once. When I saw this announcement my first thing was to CMD-F and see when Sonnet 4.1 was coming out, because I don't really care about Opus outside of interactive deep research usage.

sothatsit|6 months ago

Opus really shines for completing long-running tasks with no supervision. But if you are using Claude Code interactively and actively steering it yourself, Sonnet is good enough and is faster.

I don't believe anyone saying Sonnet yields better results than Opus though, as my experience has been exactly the opposite. But trade-off wise, I can definitely see it being a better experience when used interactively because of its speed and lower cost.

cmrdporcupine|6 months ago

The strategy I'm playing with (we'll see how good the results are) is to prompt Opus to analyze and plan but not implement.

E.g. prompt to read a paper, read some source, then write out a terse document meant to be read by machine not human.

Then switch to Sonnet, have it read that document, and do the actual implementation work.
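
A rough sketch of that handoff, assuming nothing about any real API: `ModelFn`, `plan_then_implement`, and the two stand-in functions are hypothetical names, and the prompts are placeholders rather than anything the commenter actually uses. In practice `planner` and `implementer` would wrap whatever client calls the larger and smaller model.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's text.
ModelFn = Callable[[str], str]

def plan_then_implement(task: str, planner: ModelFn, implementer: ModelFn) -> dict:
    """Two-phase handoff: one model writes a terse plan, another implements from it."""
    plan = planner(
        "Analyze the task and write a terse plan for another model to follow. "
        "Do not implement anything.\n\nTask: " + task
    )
    result = implementer(
        "Implement exactly what this plan describes.\n\nPlan:\n" + plan
    )
    return {"plan": plan, "result": result}

# Stand-in functions so the sketch runs without any API access.
def fake_planner(prompt: str) -> str:
    return "1. parse input\n2. validate\n3. write output"

def fake_implementer(prompt: str) -> str:
    # Echo back the plan portion of the prompt to show what was handed over.
    return "implemented from: " + prompt.split("Plan:\n", 1)[1]

out = plan_then_implement("add CSV export", fake_planner, fake_implementer)
print(out["result"])
```

Keeping the plan as an explicit intermediate value means it can be saved, reviewed, or edited by hand before the cheaper model starts implementing.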

chisleu|6 months ago

I use opus or gemini 2.5 pro for plan mode and sonnet for act mode in Cline. https://cline.bot

It's my experience that Opus is better at solving architectural challenges where sonnet struggles.

gpm|6 months ago

I notice that on the "Agentic Coding" benchmark cited in the article Sonnet 4 outperformed Opus 4 (by 0.2%), and underperforms Opus 4.1 (by 1.8%).

So this release might change that consensus? If you believe the benchmarks are reflective of reality anyways.

jimbo808|6 months ago

> If you believe the benchmarks are reflective of reality anyways.

That's a big "if." But yeah, I can't tell a difference subjectively between Opus and Sonnet, other than maybe a sort of placebo effect. I'm more careful to write quality prompts when using Opus, because I don't want to waste the 5x more expensive tokens.

Aeolun|6 months ago

My opinion of Opus is that it takes the correct action 19/20 times, where Sonnet takes the correct action 18/20 times. It’s not strictly necessary to use Opus, but if you have the subscription already it’s just a pure win.
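
A back-of-the-envelope illustration of how those per-action rates would compound over a longer task. The big assumptions here are not from the comment: actions are treated as independent, and a single wrong action is treated as failing the whole task. The function name is made up.

```python
def chance_all_steps_correct(per_step_accuracy: float, steps: int) -> float:
    """Probability that every one of `steps` independent actions is correct."""
    return per_step_accuracy ** steps

# Compare the two per-action rates over tasks of increasing length.
for steps in (1, 10, 20):
    opus = chance_all_steps_correct(19 / 20, steps)
    sonnet = chance_all_steps_correct(18 / 20, steps)
    print(f"{steps:>2} steps: 19/20 -> {opus:.2f}, 18/20 -> {sonnet:.2f}")
```

Under those assumptions, a 20-action task goes fully right about 36% of the time at 19/20 versus about 12% at 18/20. In interactive use, where mistakes get caught and corrected along the way, the gap would matter far less.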

sky2224|6 months ago

I've found with limited context provided in your prompt, opus is just awful compared to even gpt-4.1, but once I give it even just a little bit more of an explanation, it jumps leagues ahead.

brenoRibeiro706|6 months ago

I feel the same way. I usually use Opus to help with coding and documentation, and I use Sonnet for emails and so on.

tkz1312|6 months ago

100% Opus all the time. Sonnet seems to get confused much faster and needs more hand-holding in my experience.

rtfeldman|6 months ago

Yes, Opus is very noticeably better at programming in both Rust and Zig in my experience. I wish it were cheaper!

j45|6 months ago

Opus is superior at understanding the big picture and the direction.

Sonnet is great at banging it out.

seunosewa|6 months ago

It's ridiculously overpriced in the API. Just like o3 used to be.

taormina|6 months ago

Just more anecdata, but I entirely agree. I can't say that I am happy with Sonnet's output at any point, really, but it still occasionally works, whereas Opus has been a dumpster fire every single time.

ssss11|6 months ago

That’s very strange. Sonnet is hot garbage and Opus is a miracle, for me. I also don’t see anyone praising sonnet anywhere.