ChatGPT is worse in Russian. Example: after accurately noting that a name appeared in a particular Russian book, it asked if I wanted the direct quote in Russian. I said yes. At that point it switched to Russian output but could no longer find the name in that book, and then apologized for having used what it called "approximations" about the book earlier.
(I then went and checked the book myself; ChatGPT in English was right, the name is there.)
ehnto|5 months ago
I was using Qwen3 locally in thinking mode, and noticed that even when it is talking to me in Japanese, it does its "thinking" steps in English. Not having a full understanding of how the layers in an LLM handle language connections, I can't say for sure, but for a human this would result in subpar outcomes.
For example (not actual output):
Input: "こにちは" (konichiwa)
Qwen Thinking: "Ah, the user has said "こにちは", I should respond in a kind and friendly manner."
Qwen Output: こにちは!
It quiiiickly gets confused in this mode, much quicker than in English.
I'm kind of wondering when it will become universally understood that LLMs can't be trained on equal amounts of Japanese and Chinese content due to Han Unification, which makes the two languages an incoherent mix of two conflicting writing conventions in one set of code points. It's remarkable that Latin-script languages apparently don't face this issue, with no clear technical explanation as to why; my guess is that it has to do with the granularity of characters.
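To make the Han Unification point concrete, here is a minimal Python sketch (my illustration, not from the thread): characters like 骨, 直, and 海 are each a single Unicode code point shared by Japanese and Chinese text, even though the conventional glyph shapes differ between the two languages, so a byte-level model cannot tell from the character alone which language it is reading.

```python
# Han Unification: Japanese and Chinese share the same code points for
# most kanji/hanzi, so the raw bytes carry no language signal.
shared = "骨直海"  # rendered with different glyph conventions in JP vs CN fonts

for ch in shared:
    # Identical code point regardless of which language the text came from
    print(f"{ch} -> U+{ord(ch):04X}, UTF-8 bytes: {list(ch.encode('utf-8'))}")
```

骨 is U+9AA8 whether it appears in a Japanese or a Chinese sentence; only font or locale metadata, which is absent from plain training text, distinguishes the intended glyph.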
numpad0|5 months ago
That said, in my tiny experience, LLMs all think in their dataset majority language. They don't adhere to prompt languages, one way or another. Chinese models usually think in either English or Chinese, rarely in a cursed mix thereof, and never in Japanese or any of their non-native languages.
charlieyu1|5 months ago
I don’t think this can be solved until there is massive investment in training LLMs on native Japanese. The current ChatGPT tokenizer still uses BPE, and you can’t even represent a Japanese character with a single token.
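The tokenizer point can be illustrated with plain UTF-8 (a sketch of the byte-level starting point for BPE, not ChatGPT's actual vocabulary): every kana or kanji begins as three bytes, so it only ends up as a single token if the learned merge table happens to cover all three bytes, whereas common Latin letters start as one byte each.

```python
# Byte-level BPE (the scheme GPT-style tokenizers use) starts from UTF-8
# bytes. Latin letters are 1 byte; kana and kanji are 3 bytes each, so a
# Japanese character is only one token if the merges learned it whole.
for ch in "aあ漢":
    raw = ch.encode("utf-8")
    print(f"{ch!r}: {len(raw)} byte(s) -> {list(raw)}")
```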
lmm|5 months ago
Quite a few reasoning LLMs do their reasoning in English only, because the RL setup specifically forces them to.
ACCount37|5 months ago
Why?
Because the creators want the reasoning trace to be human-readable. And without a pressure forcing them to think in English, they tend to get weird with the reasoning trace: wild language-mixing, devolved grammar, strange language-mixed nonsense words that the LLM itself seemingly understands just fine.
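One way such pressure can be applied is a language-consistency term added to the RL reward (DeepSeek-R1's report describes a penalty of this kind). A hypothetical sketch, with `language_consistency_reward` being an illustrative name of my own, using pure-ASCII tokens as a crude proxy for "written in English":

```python
def language_consistency_reward(reasoning_trace: str) -> float:
    """Fraction of whitespace-separated tokens that are pure ASCII.

    Added to the task reward during RL, a term like this pushes the
    policy away from language-mixing in its reasoning trace.
    """
    words = reasoning_trace.split()
    if not words:
        return 0.0
    return sum(w.isascii() for w in words) / len(words)

# A mixed-language trace scores lower than a monolingual one:
print(language_consistency_reward("the answer is 42"))       # 1.0
print(language_consistency_reward("the 答え is 42 だから"))  # 0.6
```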