pseudo_meta | 6 months ago
Upon some digging, it seems that part of the slowdown is due to the gpt-5 models doing some reasoning by default (reasoning effort "medium"), even the nano and mini models. Setting the reasoning effort to "minimal" improves the speed a lot.
However, to be able to set the reasoning effort you have to switch to the new Responses API, which wasn't a lot of work, but was more than just changing a URL.
Tiberium | 6 months ago
That's not true: you can set the reasoning effort in the Chat Completions API as well (https://platform.openai.com/docs/api-reference/chat/create). The difference is that the Chat Completions API takes a top-level parameter called "reasoning_effort", while the Responses API takes a "reasoning" parameter (an object) with an "effort" field inside it.
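To make the difference concrete, here is a minimal sketch of the two request-payload shapes described above, written as plain Python dicts. No request is actually sent; the field names follow the linked API reference, and the model name is just an illustrative choice from the thread.

```python
# Chat Completions API: "reasoning_effort" is a flat, top-level string parameter.
chat_completions_payload = {
    "model": "gpt-5-mini",
    "messages": [{"role": "user", "content": "Hello"}],
    "reasoning_effort": "minimal",
}

# Responses API: "reasoning" is an object, with the effort level nested
# inside it as the "effort" field.
responses_payload = {
    "model": "gpt-5-mini",
    "input": "Hello",
    "reasoning": {"effort": "minimal"},
}
```

So switching APIs only changes where the setting lives in the payload, not the set of allowed effort values.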
pseudo_meta | 6 months ago