I just cancelled my OpenAI $200 sub yesterday because of all this, but sadly I can't agree.

Codex 5.3 Xhigh > Opus 4.6 in my work to this point.

Hoping for Opus 4.7 or whatever comes next to rectify this as I'm a bit annoyed over having to drop to a lower quality model.

lumirth|2 days ago

Weirdly enough, I agree with both sides. Opus beats every version of GPT 5 as a chat interface, hands down. ChatGPT, at this point, is mostly me correcting its output style, cadence, behavior, etc., and consistently remaining dissatisfied. Meanwhile, Opus one-shots things I didn’t even think it could (Typst code). All that said, I do my programming in OpenAI’s Codex app for Mac. It has completely dominated Claude Code for me. I’ll only ever use Opus to check 5.3-Codex’s work. Very weird world we’re living in. I hope it gets even weirder once Deepseek does whatever they’ve been cooking.

coolius|1 day ago

whatever they've been cooking at deepseek, i don't think i'm going to let their coding agent run shell stuff on my computer unless they make it free or something

XCSme|2 days ago

For coding, I agree, Codex-5.3 is the best out there.

But for the chat, I feel like ChatGPT got worse and worse.

ben_w|1 day ago

Something very weird is going on; I just tried a free trial of Codex-5.3, and a significant fraction of what it gives me doesn't even compile (or, in the case of Python, run without crashing).

Unless I specifically say "use git", it won't bother using git; apparently saying "configure AGENTS.md to use best practices" isn't enough for it to (at least in this case) use git. If this were isolated I might put that down to bad luck, given the nature of LLMs, but I have been finding that Codex uses the wrong approaches all over the place, stops in the middle of tasks, and skips some tasks entirely (sometimes while marking them as done, other times it just doesn't get around to them).

I'd rank the output of Claude as similar to a junior with 1-3 years of experience. It's not great, but it's certainly serviceable, and with a bit of tweaking even shippable. Codex… what I see is more like a student project. Or perhaps someone in the first month of their first job. Even the absolute worst human developers I've worked with after university weren't as bad as Codex, but several of them I'd rank worse than Claude.

jetbalsa|2 days ago

What were you using it for? Claude is really good at agentic stuff. For pure coding, I can see Codex being better, but for the entire workflow, I'm not sure.

virgildotcodes|2 days ago

I use Codex purely for coding, and that's 90% of my use case for AI in general (10% using ChatGPT web for misc stuff). I pop out to Opus in Claude Code regularly to try to stay up on their relative performance, but so far the primary value I've been able to derive from CC is as a second set of eyes for code review / poking holes in plans. For primary planning / debugging / implementation Codex outclasses it atm sadly.

verst|2 days ago

I use Opus 4.6 Fast-mode. It produces significantly better results in my work than any Codex 5.3 tier.

YuriNiyazov|2 days ago

Me too. It's great that my employer pays for it and there's basically no budget, because this configuration is 10x more expensive than the regular default Sonnet.

virgildotcodes|2 days ago

Rapid iteration would possibly make up for the drop in quality, but I can't afford to use fast mode as I'm a contractor and pay for my own AI usage :(