Good work. I've often found LLMs to be "stupider" when speaking Norwegian than when speaking English, so it's not surprising that they hallucinate more and stick to their instructions less in other non-English languages.
Do you think there would be value in a workflow that translates all non-English input to English first, evaluates it, then translates the output back as needed?
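The workflow described above can be sketched in a few lines. This is a minimal illustration, not anyone's actual implementation: every function here (`detect_language`, `translate`, `run_llm`) is a hypothetical placeholder you would swap for a real language-detection library, MT service, and model call.

```python
def detect_language(text: str) -> str:
    # Placeholder heuristic: a real pipeline would use a proper
    # language-detection library, not an ASCII check.
    return "en" if text.isascii() else "other"

def translate(text: str, source: str, target: str) -> str:
    # Placeholder: identity "translation" stands in for a real MT call.
    return text

def run_llm(prompt: str) -> str:
    # Placeholder for the actual model call.
    return f"[answer to: {prompt}]"

def answer_via_english(prompt: str) -> str:
    """Normalize input to English, run the model, translate back."""
    lang = detect_language(prompt)
    if lang == "en":
        return run_llm(prompt)
    english_prompt = translate(prompt, source=lang, target="en")
    english_answer = run_llm(english_prompt)
    return translate(english_answer, source="en", target=lang)
```

Note that the English pivot adds two translation steps where errors can creep in, which is exactly the mistranslation risk raised later in this thread.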
Personally, I don't bother prompting LLMs in Japanese, AT ALL, since I'm functional enough in English (a low bar, apparent from my comment history) and because they behave a lot stupider otherwise. Japanese is always the extreme example for everything, but yes, I'd find it believable that merely normalizing input by translating it first just worked.
What would be interesting then would be to find out what the composite function of translator + executor LLMs looks like. These behaviors make me wonder whether modern transformer LLMs are actually ELMs, English Language Models. Because otherwise there'd be, like, dozens of functional, 100% pure French-trained LLMs, and there aren't.
LLMs tend to "average out" language, making it less nuanced and more predictable. Combined with outright mistranslations, I don't think it'd perform better than what "reasoning mode" already does.
turnsout|10 days ago
numpad0|10 days ago
pjc50|10 days ago
internet_points|6 days ago
faeyanpiraat|10 days ago
there must be a ranking of languages by "safety"