theorchid | 20 days ago

In my opinion, Opus 4.6 handles instructions (especially negative rules) better than Opus 4.5 and Sonnet 4.5. It also applies skills better than other models.

But I think this varies from person to person.

Another example. I asked gpt-5.2-codex to add an array of 5 values to the script and write a small piece of code. Then I manually deleted one of the values from the array and asked the agent to commit. Instead, the model edited the file again and restored the value I had deleted. I deleted the value again and asked the agent to “just commit,” but it edited the file again before committing. This happened many times, even when I tried different phrasings such as “never edit the file, just commit.” Each time, the model responded that it understood the instruction and then started editing the file anyway. I switched to gpt-5.2, but that didn't help.

I switched to sonnet-4.5, and it immediately committed on the first try without editing the file.