mordymoop | 4 months ago

I'm on the same page here. I have seen this sentiment about Codex suddenly being good a few times now, so I booted Codex CLI thinking-high back up after a break and asked it to look for bugs. It promptly found five bugs that didn't actually exist. It was the kind of truly impressively stupid mistake that I haven't seen Claude Code make essentially ever, and made me wonder if this isn't the sort of thing that's making people downplay the power of LLMs for agentic coding.

stavros|4 months ago

I asked Sonnet 4.5 to find bugs in the code, and it found five high-impact bugs that, when I prompted it a second time, it admitted weren't actually bugs. It's definitely not just Codex.

throwaway-0001|4 months ago

In my case, Codex fixed a bug in one shot. It took 10 minutes to debug and find it.

Claude struggled for a long time and still didn't find it.