
(no title)

We’ve been testing the upgraded models in the API (where you can control when the upgrade happens), and the newer ones perform significantly worse than the older ones on the same tasks. Tweaking the prompts helps somewhat, but not enough. We’re staying on the older models in production for now.

I hope OpenAI figures this out, because quality has been their biggest moat until now.
