Rperry2174|25 days ago
With Codex (5.3), the framing is an interactive collaborator: you steer it mid-execution, stay in the loop, course-correct as it works.
With Opus 4.6, the emphasis is the opposite: a more autonomous, agentic, thoughtful system that plans deeply, runs longer, and asks less of the human.
That feels like a reflection of a real split in how people think LLM-based coding should work: some want tight human-in-the-loop control, and others want to delegate whole chunks of work and review the result.
Interested to see whether models eventually optimize for those two philosophies, and for the 3rd, 4th, and 5th philosophies that emerge in the coming years.
Maybe it will be less about benchmarks and more about different ideas of what working-with-AI means.
karmasimida|25 days ago
> With Opus 4.6, the emphasis is the opposite: a more autonomous, agentic, thoughtful system that plans deeply, runs longer, and asks less of the human.
Isn't the UX the exact opposite? Codex thinks much longer before giving you back an answer.
ghosty141|25 days ago
Having a human in the loop eliminates all the problems that LLMs have, and continuously reviewing smallish chunks of code works really well in my experience.
It saves so much time having Codex do all the plumbing so you can focus on the actual "core" part of a feature.
LLMs still can't think and generalize (and I doubt that changes). If I tell Codex to implement 3 features, it won't stop and find a general solution that unifies them unless explicitly told to. This makes it kinda pointless for the "full autonomy" approach, since effectively code quality and abstractions completely go down the drain over time. That's fine if it's just prototyping or "throwaway" scripts, but for bigger codebases where longevity matters it's a dealbreaker.
NuclearPM|24 days ago
That could easily be automated.
Rperry2174|25 days ago
Specifically, the GPT-5.3 post explicitly leans into "interactive collaborator" language and steering mid-execution:
OpenAI post: "Much like a colleague, you can steer and interact with GPT-5.3-Codex while it’s working, without losing context."
OpenAI post: "Instead of waiting for a final output, you can interact in real time—ask questions, discuss approaches, and steer toward the solution"
Claude post: "Claude Opus 4.6 is designed for longer-running, agentic work — planning complex tasks more carefully and executing them with less back-and-forth from the user."
mcintyre1994|25 days ago
I haven’t used Codex but use Claude Code, and the way people (before today) described Codex to me was like how you’re describing Opus 4.6
So it sounds like they’re converging toward “both these approaches are useful at different times” potentially? And neither want people who prefer one way of working to be locked to the other’s model.
giancarlostoro|25 days ago
This feels wrong. I can't comment on Codex, but Claude will prompt you and ask before changing files. Even when I run it in dangerous mode on Zed, I can still review all the diffs and undo them, or, you know, tell it what to change. If you're worried about it making too many decisions, you can pre-prompt Claude Code (via .claude/instructions.md) and instruct it to always ask follow-up questions regarding architectural decisions.
Sometimes I go out of my way to tell Claude DO NOT ASK ME FOR FOLLOW-UPS, JUST DO THE THING.
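For what it's worth, a minimal sketch of what such a pre-prompt file could look like (the path is the one mentioned above; the rules themselves are illustrative, not an official format):

```
# .claude/instructions.md
- Before any architectural decision (new modules, new dependencies,
  schema changes), stop and ask a follow-up question.
- Propose a short plan and wait for approval before editing files.
- Keep individual changes small enough to review in one sitting.
```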
Rperry2174|25 days ago
I guess it's also quite interesting that how they're framing these projects is the opposite of how people currently perceive them, and I guess that may be a conscious choice...
jhancock|24 days ago
I usually want the codex approach for code/product "shaping" iteratively with the ai.
Once things are shaped and common "scaling patterns" are well established, then for things like adding a front end (which is constantly changing, more views), letting the autonomous approach run wild can *sometimes* be useful.
I have found that Codex is better at remembering when I ask it to not get carried away... whereas Claude requires constant reminders.
techbro_1a|25 days ago
This is true, but I find that Codex thinks more than Opus. That's why 5.2 Codex was more reliable than Opus 4.5
bob1029|25 days ago
I would much rather work with things like the Chat Completions API than any frameworks that compose over it. I want total control over how tool calling and error handling work. I've got concerns specific to my business/product/customer that couldn't possibly have been considered as part of these frameworks.
Whether or not a human needs to be tightly looped in could vary wildly depending on the specific part of the business you are dealing with. Having a purpose-built agent that understands where additional verification needs to occur (and not occur) can give you the best of both worlds.
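A minimal sketch of what that hand-rolled control can look like, assuming a Chat Completions-style tool-call shape (a tool name plus a JSON arguments string); the `lookup_order` tool and its error policy are hypothetical, purely for illustration:

```python
import json

# Hypothetical business-specific tool; stands in for real product logic.
def lookup_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"lookup_order": lookup_order}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run one model-requested tool call with error handling we fully control:
    unknown tools and malformed arguments become structured error payloads
    the model can recover from, instead of exceptions killing the agent loop."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as e:
        return json.dumps({"error": f"invalid arguments: {e}"})
    tool = TOOLS.get(name)
    if tool is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(tool(**args))
    except Exception as e:  # business-specific verification/retry policy goes here
        return json.dumps({"error": str(e)})
```

The point isn't the dozen lines saved by a framework; it's that every branch of this function is a policy decision (what to log, when to retry, when to loop a human in) that you own outright.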
F7F7F7|24 days ago
I’ve had both $200 plans and now just have Max x20 and use the $20 ChatGPT plan for an inferior Codex.
My experience (up until today) has always been that Codex acts like that one Sr Engineer that we all know. They are kind of a dick. And will disappear into a dark hole and emerge with a circle when you asked for a pentagon. Then let you know why edges are bad for you.
And yes, Anthropic is pivoting hard into everything agentic. I bet it’s not too long before Claude Code stops differentiating models. I had Opus blow 750k tokens on a single small task.
cchance|25 days ago
There are hundreds of people posting about Codex 5.2 running for hours unattended and coming back with full commits.
d--b|25 days ago
I mean, Opus asks a lot whether it should run things, and each time you can tell it to change. And if that's not enough, you can always press Esc to interrupt.