drewnick | 9 months ago

I just had an issue with RLS/schema/Postgres stuff. Gemini 2.5 Pro swung and missed, talking a lot with little code. Claude Sonnet 4 solved it. o1 Pro solved it. It's definitely random which of these models can solve various problems with the same prompt.

diggan | 9 months ago

> definitely random which of these models can solve various problems with the same prompt.

Yeah, this is borderline my feeling too. Kicking off Codex four times with the same prompt sometimes leads to four very different but confident solutions. Same when using the chat interfaces, although it seems like Sonnet 3.7 with thinking and o1 Pro Mode are a lot more consistent than any Gemini model I've tried.