

mordymoop | 4 months ago

I'm on the same page here. I have seen this sentiment about Codex suddenly being good a few times now, so I booted Codex CLI thinking-high back up after a break and asked it to look for bugs. It promptly found five bugs that didn't actually exist. It was the kind of truly impressively stupid mistake that I haven't seen Claude Code make essentially ever, and made me wonder if this isn't the sort of thing that's making people downplay the power of LLMs for agentic coding.


stavros|4 months ago

I asked Sonnet 4.5 to find bugs in the code, and it found five high-impact bugs that, when I prompted it a second time, it admitted weren't actually bugs. It's definitely not just Codex.

throwaway-0001|4 months ago

In my case, Codex fixed a bug in one shot. It took 10 minutes to debug and find it.

Claude struggled for a long time and still didn't find it.