zurfer|2 months ago
It's a cool release, but if someone on the Google team reads this:
Flash 2.5 is awesome in terms of latency and total response time without reasoning. In quick tests this model seems to be 2x slower, so for certain use cases, like quick one-token classification, Flash 2.5 is still the better model.
Please don't stop optimizing for that!
edvinasbartkus|2 months ago
thinkingConfig: { thinkingLevel: "low" }
More about it here https://ai.google.dev/gemini-api/docs/gemini-3#new_api_featu...
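For context, the setting mentioned above is passed in the request config. A minimal sketch of the request shape, assuming the field names from the JS `@google/genai` SDK; the model id and prompt here are illustrative, not confirmed:

```typescript
// Request shape for a low-latency call (assumption: field names follow the
// JS @google/genai SDK; the model id below is illustrative).
const request = {
  model: "gemini-3-flash-preview",
  contents: "Is this ticket a bug report? Answer yes or no.",
  config: {
    // "low" trades reasoning depth for latency -- closer to the
    // non-reasoning Flash 2.5 behavior zurfer is asking to keep.
    thinkingConfig: { thinkingLevel: "low" },
  },
};

console.log(request.config.thinkingConfig.thinkingLevel); // prints "low"
```

The object would be passed to the SDK's `generateContent` call; whether "low" matches Flash 2.5's non-reasoning latency is exactly the open question in the parent comment.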
zurfer|2 months ago
On that note it would be nice to get these benchmark numbers based on the different reasoning settings.
andai|2 months ago
https://ai.google.dev/gemini-api/docs/thinking#levels
bobviolier|2 months ago