top | item 44806494

bayesianbot | 6 months ago

OpenAI also has Flex processing[1] for o3. I've spent most of my time with Gemini 2.5, but lately I've been trying o3 a lot, as it seems to work quite well and I get really cheap tokens (~95% of my agentic tokens are cached, which is a 75% discount, and flex mode adds another 50% off, for about $0.25/million input tokens).

[1] https://platform.openai.com/docs/guides/flex-processing?api-...
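Per the linked docs, a flex-tier call is just a normal Chat Completions request with `service_tier` set to `"flex"` (and a longer client timeout, since flex requests can queue). A minimal sketch, assuming the official `openai` Python SDK; the helper name is hypothetical:

```python
def flex_request_kwargs(prompt: str) -> dict:
    """Build kwargs for a flex-tier o3 chat completion (hypothetical helper)."""
    return {
        "model": "o3",
        "service_tier": "flex",  # opt into discounted, slower processing
        "messages": [{"role": "user", "content": prompt}],
    }

# Usage (requires `pip install openai` and OPENAI_API_KEY set):
#   from openai import OpenAI
#   client = OpenAI(timeout=900.0)  # flex jobs can queue, so allow a long wait
#   resp = client.chat.completions.create(**flex_request_kwargs("hello"))
#   print(resp.choices[0].message.content)
```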


esafak | 6 months ago

Which agents support flex mode?

bayesianbot | 6 months ago

I've made my own fork of Codex that always uses flex, or you can route agents through litellm and have it add the service_tier parameter. I haven't really seen native support for it anywhere.
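The routing approach can be sketched without forking anything: wrap the outgoing call and inject the tier before it reaches the provider. A sketch assuming LiteLLM forwards extra kwargs like `service_tier` through to OpenAI; the wrapper name is hypothetical:

```python
def with_flex(kwargs: dict) -> dict:
    """Inject service_tier='flex' into a completion call unless the
    caller already chose a tier (hypothetical wrapper)."""
    out = dict(kwargs)
    out.setdefault("service_tier", "flex")
    return out

# Usage (requires `pip install litellm` and OPENAI_API_KEY set):
#   import litellm
#   resp = litellm.completion(**with_flex({
#       "model": "openai/o3",
#       "messages": [{"role": "user", "content": "hello"}],
#   }))
```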