This is probably not a core concern for most HN readers, but at work we do multilingual testing for synthetic text data generation and natural language processing. Emphasis on multilingual. Gemini has made some serious leaps from 1.5 to 2.5 and now 3.0, and is actually proficient in languages that other models can only dream of. On the other hand, GPT-5 has really mixed performance in a lot of categories.

deaux|2 months ago

This goes way back. Even in the 1.5 days it was the best multilingual model, when HN still treated it as entirely uncompetitive all-around. Just because, exactly as you're saying, it's not a core concern of people here. The two fields Gemini models have been number one at for years now are (A) multilinguality and (B) image understanding. At no point since the release of Gemini 1.5 Pro has any Anthropic or OpenAI model performed better at either.

Even those who have zero experience with different (human) languages could've known this if they liked, from the fact that on the LMArena leaderboards, Gemini models have consistently ranked much higher in non-English languages than in English. This gap has actually shrunk a lot over time! In the 1.5 Pro days this advantage was huge: it would be something like 10th in English and 2nd in many other languages.

Nevertheless, it still depends on the specific language you're targeting. Gemini isn't the winner on every single one of them. If you're only going to choose one model for use with many languages, it should be Gemini. But if the set of languages isn't too large, optimizing model selection per language is worth it.
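To make the per-language selection idea concrete, here is a rough sketch, assuming you already have your own evaluation scores per (language, model) pair. The function name `build_router`, the model names, and the scores in the usage note are all hypothetical placeholders, not real benchmark results.

```python
def build_router(scores, fallback):
    """Build a language -> model lookup from evaluation results.

    scores:   iterable of (language, model, score) tuples from your own
              tests; higher score is assumed to mean better.
    fallback: model to use for any language you have not evaluated.
    """
    best = {}  # language -> (model, score) of the best result seen so far
    for language, model, score in scores:
        if language not in best or score > best[language][1]:
            best[language] = (model, score)

    routing = {language: model for language, (model, _) in best.items()}

    def route(language):
        # Untested languages fall back to the single general-purpose choice.
        return routing.get(language, fallback)

    return route
```

Usage would look something like `route = build_router([("cs", "model-a", 0.81), ("cs", "model-b", 0.86)], fallback="model-a")`, after which `route("cs")` picks the better-scoring model for Czech and any unlisted language gets the fallback. With a small language set the table is cheap to maintain; with a large one, the single fallback model carries most of the load.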

curioussquirrel|2 months ago

In our previous tests, when it was 1.5 Pro against GPT-4o and Claude Sonnet 3.7, Gemini wasn't winning the multilingual race, but it was definitely competitive. 2.5 and 3.0 seem to be big leaps from the 1.5 days. That said, it also depends on the testing methodology; we tested a bunch of use cases, mostly to test core linguistic proficiency. Not so much complex language tasks or cultural knowledge.

curioussquirrel|2 months ago

And regarding specific models - we obviously only tested a few languages, and there are thousands of them in the world. But Gemini seems to lead the pack basically regardless of the language you throw at it. YMMV.

jimmydoe|2 months ago

Very good to know. I use Gemini for a lot of translation-related work; the 1M context window is very helpful too.