top | item 44301150

hs86 | 8 months ago

I am always disappointed when I compare the answers to the same queries on 2.5 Pro vs. o4-mini/o3. But trying out the same query in AI Studio gives much better results, closer to OpenAI's models. What is wrong with 2.5 Pro in the Gemini app? I can't believe that the model in their consumer app would produce the same benchmark results as 2.5 Pro in the API or AI Studio.

thimabi | 8 months ago

The models in the Gemini app are nerfed in comparison to those in AI Studio: they have a smaller thinking budget, output fewer tokens, and have various safety filters. There’s certainly a trade-off between using AI Studio for its better performance and using the API or the Gemini app in a way that doesn’t involve Google keeping your data for training purposes.

mh- | 8 months ago

I don't have any inside information, but I'm sure there are different system prompts used in the Gemini chat interface vs the API. On OpenAI/ChatGPT they're sometimes dramatically different.