Notice how pricing is the top discussion theme. People love free shit and it's hard to deny codex usage limits are more generous. My 2c as someone who uses both tools pretty consistently in an enterprise context:
- Codex-medium is better if you have a well articulated plan you "merely" need to execute on, need help finding a bug, have some specific complex piece of logic you need to tweak, truly need a ton of long range context to reason about an issue. It's great and usage limits are very generous!
- Sonnet 4.5 is better for everything else. That means for me: non-coding CLI ops, git ops, writing code with it as a pair programmer, OOD tasks, big new chunks of functionality that are highly conceptual, architectural discussion, etc. I generally approve every edit and often interrupt it. The fast iteration and feedback is key.
I probably use CC 80% of the time with Codex the other 20%. My company pays for CC and I don't even look at the cost. Most of my coworkers use CC over Codex. We do find the Codex PR reviewer to be the best of any tool out there.
Codex gets a lot of play on twitter also because a lot of the most prolific voices there are solo devs who are "building in public". A greenfield, solo project is the ideal (only?) use case for running 5 agents in parallel or whatever. Codex is probably amazing at that. But it's not practical for building in enterprise contexts IMO.
Interesting, my experience has been the opposite. I've been running Codex and Sonnet 4.5 side by side the past few weeks, and Codex gives me better results 90% of the time, pretty much across all tasks. Where Claude really shines is that it's much faster than codex. So if I know exactly what I want or if it's a simpler task I feel comfortable giving it to Claude because I don't want to wait for Codex to work through it. Claude cli is also a much better user experience than codex cli. But Codex gets complex things right more consistently.
They are similar enough that using one over the other is at most a small mistake. I prefer Claude models (perhaps I'm more used to them?) but Codex is also very good.
For larger tasks that I know are parallelizable, I just tell Claude to figure out which steps can be parallelized and then have it go nuts with sub-agents. I’ve had pretty good success with that.
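To make "figure out which steps can be parallelized" concrete, here is a minimal sketch of grouping steps into batches that could run side by side. It is only an illustration: the step names and the `parallel_batches` helper are hypothetical, and this is not a description of how Claude plans sub-agent work internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_batches(dependencies):
    """Group steps into batches whose members do not depend on each other.

    `dependencies` maps each step name to the set of steps it must wait for.
    Every step in a batch could, in principle, go to its own sub-agent.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the steps form a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps with all prerequisites done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock the next one
    return batches


# Hypothetical task breakdown, purely for illustration.
steps = {
    "schema": set(),
    "api": {"schema"},
    "ui": {"schema"},
    "tests": {"api", "ui"},
}
print(parallel_batches(steps))
```

In this made-up breakdown, "api" and "ui" land in the same batch because both wait only on "schema", so those are the two steps worth handing to separate sub-agents.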
> better for everything else. That means for me: non-coding CLI ops, git ops, writing code with it as a pair programmer, OOD tasks, big new chunks of functionality that are highly conceptual, architectural discussion..
I would argue this is the wrong way of using these tools. Writing out a defined plan in plain English and then having Codex / Claude write it out is better, since that way we understand the intention. You can always have Codex come up with an abstract plan first, iterate on it and then implement. Kind of like how we would implement software in real life.
In my experience gpt5-codex (medium) and codex-cli are notably better than Sonnet 4.5 and claude-code.
(note: never tried Opus)
It is slower, but the results are much more often correct and it doesn't rush into half-baked solutions/dumb approaches as eagerly.
I'd much rather wait 5 minutes than have to clean up manually or try to coax a model into doing things differently.
I also wouldn't be surprised if the slowness was partially due to OpenAI being quite resource constrained. They are repeatedly complaining about not having sufficient compute.
Bigger picture: I think all the AI coding environments are incredibly immature. There are many improvements to be unlocked.
Where codex falls short is in background processing (running a daemon in the background and using its output as context while staying interactive for the user) and in subagents, i.e., doing multiple things in parallel. Presumably codex will catch up, but for now, that puts Claude Code ahead for me.
As far as which one is better, it's highly dependent on what we're each doing, but I will say that I have this one project where bare "make" won't work, and I have a script that needs to be run instead. I have instructions to call that script in multiple .md files, and codex is able to call the script instead of make, but it keeps forgetting that and tries to run make which fails and it gets confused. (Claude code running on macOS host but build on Linux vm.) I could work around it, but that really takes the "shiny" factor off of codex+GPT-5 for me.
Seems like HN is slowly split between “ai sucks” and everyone else who is slowly discovering what it can do, while Twitter is leagues ahead using other tools to build stuff.
I really like codex… but without the ability to launch sub-agents, it kinda struggles with context.
The biggest thing I use agents for is getting good search with less context.
Codex just struggles when the model needs to search too much because of this. Codex also struggles with too much context: there have been a number of times when it has just run up against the context limit and couldn’t compact, so you just lose everything since your last message, which has been a lot of lost context/work for me.
I like each at different times in different ways. Now I have both running in separate Tmux panes and have one talk to the other to ask/delegate/verify/validate, using my Tmux-cli tool (now a Claude skill of course):
Now my work on a project often spans multiple sessions of these agents. So I use a session-finder and resume/dump tool (also in that repo). I often ask Claude or codex to extract all useful details from a jsonl session log file so I can continue the work.
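For anyone who would rather pre-filter a session log before handing it to an agent, here is a rough sketch of that kind of extraction. The schema is an assumption (one JSON object per line with string `role` and `content` fields); real Claude Code or Codex session logs may be shaped differently, and `extract_messages` / `to_handoff_notes` are hypothetical helpers, not part of the repo mentioned above.

```python
import json
from pathlib import Path


def extract_messages(log_path, roles=("user", "assistant")):
    """Collect (role, text) pairs from a JSONL session log.

    Assumed, hypothetical schema: one JSON object per line with string
    fields "role" and "content". Real agent logs may differ.
    """
    messages = []
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt lines
        if not isinstance(record, dict):
            continue
        role = record.get("role")
        content = record.get("content")
        if role in roles and isinstance(content, str) and content.strip():
            messages.append((role, content.strip()))
    return messages


def to_handoff_notes(messages, max_chars=500):
    """Render the messages as a plain-text digest, truncating long entries."""
    return "\n".join(f"[{role}] {text[:max_chars]}" for role, text in messages)
```

The digest can then be pasted into a fresh session as starting context, which keeps the new session from having to re-read the whole log.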
For a good month I juggled between Claude Code and Codex CLI and found that Codex CLI did the job better. I recently ditched Claude Code and am currently only using Codex CLI.
It's interesting to me that Codex has such high sentiment. I'm definitely an outlier on the more principled end of the spectrum, but I refuse to use OpenAI products.
I take issue with the AI industry in general and the hand-wavy approach to risk, but OpenAI really is on another level in my book. While I don't trust the industry's approach to AI development, with OpenAI I don't trust the leadership's intentions.
For me, I don't understand Codex the same way I don't understand Gemini.
In my day-to-day tasks the only models that actually do what I want are the Anthropic ones; all the others just fall flat on their face most of the time and end up creating more work than the Anthropic models.
I wonder if it's because I tend to abuse my models and constantly tell them that they're stupid
It has a fatal flaw: 80% of the screen is taken up by the code editor window and a file explorer. Why do you need to waste 80% of your screen on things you don't use anyway?
Regular codex user. It's my typing assistant. Allows me to be the ideas guy when writing software. Codex makes plenty of mistakes when generating large blocks of code, but it's easier to clean up and consolidate with a refactoring pass once the typing has been done.
It looks like you have not reviewed r/ClaudeAI. This is a much larger subreddit and most of the posts are about Claude Code. Many comparisons of CC vs Codex.
This sub is full of "vibe coders" that use "prompt engineered" 1000 line prompts with 500 MCPs and then complain that they reach their limit in the first day while using the $200 max plan.
Reading the comments and posts about both Claude Code and Codex on Reddit (and often Hacker News), it’s hard to imagine they’re not extremely astroturfed.
There seems to be a constant stream of not terribly interesting or unique “my Claude code/codex success story” blog posts that manage to solicit so many upvotes.
I think I'm partly responsible. I've been having a lot of fun with these tools, and so seeing other people doing the same just makes me want to engage even if the discussion isn't particularly sophisticated. I swear I'm not paid to do this (actually I pay out the wazoo for Claude..)
In life, it helps to be skeptical, so the real question is where do I find real life humans to ask about their experiences? And even then, they could still be paid actors. Though, I've often wondered how would that work. Like, the marketing department staffed by hot people finds developers and then offers to Venmo them $500 to write something nice online about the product? It's a big Internet, and there's a lot of people on Upwork, so I'm not saying it isn't happening, but I've never gotten an email asking me to write something nice about Claude Code in exchange for a couple of bucks.
Truthfully I find sonnet-4.5 better at Rust code than Codex (medium/high). Haven't tried anything else (like react/typescript) since I only use AI for issues/problems I don't understand.
My suspicion is that much (or at least some) of the negative sentiment towards Claude Code is from folks who were on it early (when Claude Code was even more widely used than Codex) and created intensive workflows using it. When Anthropic tightened quotas to make usage more equitable across plan users, they were much more likely to be impacted.
This is obviously pure conjecture, but perhaps the OE folks had automated their multiple roles and now they need to be more involved.
Meanwhile I am talking about unique shit with Claude Code trying to draft on that sentiment for little to no traction with them. We've built the best way to automate and manage production infrastructure using these models and no one gives a shit. It's so weird.
extr|4 months ago
loveparade|4 months ago
qsort|4 months ago
quintu5|4 months ago
another_twist|4 months ago
the_duke|4 months ago
fragmede|4 months ago
ripped_britches|4 months ago
Rather, the real reason codex takes longer is that it does more work to read more context.
IMO the results are much better with codex, not even close
aiisthefiture|4 months ago
radial_symmetry|4 months ago
visiondude|4 months ago
candiddevmike|4 months ago
aaronSong|4 months ago
spott|4 months ago
d4rkp4ttern|4 months ago
https://github.com/pchalasani/claude-code-tools
mpaepper|4 months ago
prameshbajra|4 months ago
_heimdall|4 months ago
gorjusborg|4 months ago
Me too, so much so that I doubt this is legitimate. This blog post is the only place I've seen people 'raving' about codex.
Claude Code is the current standard all others are measured against.
kachapopopow|4 months ago
projektfu|4 months ago
drooby|4 months ago
omgitspavel|4 months ago
another_twist|4 months ago
atlgator|4 months ago
sunaookami|4 months ago
velcrovan|4 months ago
k__|4 months ago
daliusd|4 months ago
mikeocool|4 months ago
nl|4 months ago
I've been coding for 30 years.
Using Codex I'm finally enjoying it again for the first time in maybe 15 years. Outsource all that annoying part? Heck yeah - bring it on.
And I tell everyone I can how transformational it has been for me.
resonious|4 months ago
fragmede|4 months ago
eddiewithzato|4 months ago
konishipolis|4 months ago
camel-cdr|4 months ago
nickstinemates|4 months ago
frtime38|4 months ago
[deleted]
kazinator|4 months ago