(this is why Opus 4.6 is worth the price -- turning off thinking makes it 3x-5x faster but it loses only a small amount of intelligence. nobody else has figured that out yet)
You can turn off thinking in Gemini pro models by using completion mode.
Essentially, append a message with role=model and minimal text part, such as a simple "A", at the end of the "contents" array. The model will try to complete the message without using any thought tokens.
You can also set the model message to start with "think" or something along that line and watch it thinks out loud (or melts down with over-thinking and stop due to reaching maximum output token)
What about for analysis/planning? Honestly I've been using thinking, but if I don't have to with Opus 4.6 I'm totally keen to turn it off. Faster is better.
iCarrot|10 days ago
Essentially, append a message with role=model and minimal text part, such as a simple "A", at the end of the "contents" array. The model will try to complete the message without using any thought tokens.
You can also set the model message to start with "think" or something along that line and watch it thinks out loud (or melts down with over-thinking and stop due to reaching maximum output token)
``` [ { "parts": [{"text": "hello"}], "role": "user" }, { "parts": [{"text": "*think"}], "role": "model" } ] ```
jbellis|3 days ago
sunaookami|10 days ago
girvo|10 days ago