alekandreev | 11 months ago

That's an idea we've thought about. However, we think the open source community has already created a very impressive set of language- or region-specific finetunes [1] [2]. There is also a lot of cultural context and nuance in every language that we don't have the capacity to cover sufficiently. So for v3 we focused on creating the best foundational multilingual model.

[1] https://huggingface.co/aiplanet/buddhi-indic

[2] https://ai.google.dev/gemma/gemmaverse/sealion

jjani|11 months ago

Just wanted to say that Gemini 1.5-Pro is still the SOTA foundational model for certain languages (even compared with non-Google models), so it's disappointing to have received the email that it will be removed in September. It will cause our product quality to go backwards when we're forced to replace it with a worse model, unless a better one appears in that time. We've extensively tested all the big models, and for the languages in question none of them perform on the same level.

Happy to elaborate if there's a way to get in touch, in case the team isn't aware of this.

mdp2021|11 months ago

And have you measured the trade-off that could come with embracing such a large number of languages and alphabets? It would be interesting to note whether you are sacrificing some response quality, or if such supposed sacrifice is interestingly negligible, or if - even more interestingly - the quality increases with the added proficiency.

alekandreev|11 months ago

Yes, we have measured the tradeoff. We don't see English perplexity get worse when introducing multilingual data, but there is a slight drop in some English language-specific evals (~1%).

Workaccount2|11 months ago

There are enough small model teams competing that I feel confident one of them will try this, and if just sticking to English gives a large boost, the others will be forced to follow suit.

It would also kind of suck for non-English speakers, because it will just be another feather in the cap of "English eats the world".