top | item 46928225

(no title)

niobe | 22 days ago

So fast mode uses more tokens, the direct opposite of Gemini, where 'fast' mode means fewer. One more piece of useless knowledge to remember.

Sol- | 22 days ago

I don't think that's the case, at least according to the docs. The effort level is what changes token usage (lower effort, fewer tokens), while fast mode is an independent setting that just seems to use some higher-priority infrastructure to serve your requests.

Aurornis | 22 days ago

You're comparing two different things. It's not useless knowledge, it's something you need to understand.

Opus fast mode is routed to different servers with different tuning that prioritizes individual response throughput. Same model served differently. Same response, just delivered faster.

The Gemini fast mode is a different model (most likely) with different levels of thinking applied. Very different response.
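
To make the distinction concrete, here is a rough back-of-the-envelope sketch. It assumes response time is simply output tokens divided by serving throughput; every number is made up for illustration, and none of the names correspond to a real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Toy model of one response. All values are hypothetical."""
    output_tokens: int        # how much the model writes (model / thinking level)
    tokens_per_second: float  # how fast the serving stack delivers it

    def seconds(self) -> float:
        # Simplifying assumption: latency is just tokens / throughput.
        return self.output_tokens / self.tokens_per_second


# Hypothetical baseline: 2000 tokens at 50 tokens/s.
baseline = Request(output_tokens=2000, tokens_per_second=50.0)

# "Same model served differently": same tokens, higher throughput.
faster_serving = Request(output_tokens=2000, tokens_per_second=125.0)

# "Different model / less thinking": fewer tokens, same throughput.
less_thinking = Request(output_tokens=800, tokens_per_second=50.0)

for name, req in [("baseline", baseline),
                  ("faster serving", faster_serving),
                  ("less thinking", less_thinking)]:
    print(f"{name}: {req.output_tokens} tokens in {req.seconds():.0f}s")
```

In this toy setup both changes finish in the same wall-clock time, but only the first leaves the token count (and so the response itself) unchanged, which is the difference being described above.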