top | item 47134656


krige | 5 days ago

Context-dependent phrases, from simple pronouns to whole domain-specific terms, are still randomly wrong, sometimes appallingly so. Hallucinations still happen. The auto-AI translation YouTube uses is, bluntly, horrid. Any jokes, even obvious ones, are still fumbled frequently.

LLM-based translation looks more convincing but requires the same level of scrutiny that previous tools did. From a workflow POV, they only added higher compute costs for very questionable gains.


andy12_ | 5 days ago

> Auto-AI translation youtube uses is, bluntly, horrid. Any jokes, even obvious ones, are still fumbled frequently.

YouTube auto-translations are horrible indeed, and I say that as someone who has to live with the fact that YouTube decides to badly translate titles from a language I understand into Spanish, because bilingual people don't exist, I suppose. But that is because they use some dumb cheap model to make the translations; probably not even a Gemini-based model.

TeMPOraL | 5 days ago

> Hallucinations still happen. Auto-AI translation youtube uses is, bluntly, horrid. Any jokes, even obvious ones, are still fumbled frequently.

I've seen that too, but those were all dedicated translation tools and auto-translate features.

My benchmark is against SOTA LLMs used directly: I copy the text (or media) in question, paste it directly into ChatGPT or Gemini (using the best model on the basic paid tier), and ask for a translation. Not always perfect, but nearly so - and they naturally ingest additional context if available - such as the surrounding text, the title/ID/URI of the document/website you're looking at, or additional explanations in the prompt - and make very good use of it. This has always been missing in dedicated tools, historically built around the mistaken assumption that translation is merely a function of the input text and a pair of language designators (from, to). The shorter the input, the more apparent it becomes how much context matters.
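To make the point concrete, the difference from the old "(text, from, to)" model can be sketched as a prompt builder that folds surrounding text and document metadata into the request. Everything here is hypothetical (the function name, the fields) - just an illustration of what "translation as a function of context" looks like, not any real tool's API:

```python
def build_translation_prompt(text, source_lang, target_lang,
                             surrounding_text=None, doc_title=None, notes=None):
    """Assemble an LLM translation prompt that carries context,
    not just the input text and a (from, to) language pair."""
    parts = [f"Translate the following {source_lang} text into {target_lang}."]
    if doc_title:
        # Title/ID of the document or page the text came from
        parts.append(f"It appears in a document titled: {doc_title}")
    if surrounding_text:
        # The sentences around the input often disambiguate short inputs
        parts.append(f"Surrounding text, for context:\n{surrounding_text}")
    if notes:
        # Free-form hints from the user (register, domain, audience)
        parts.append(f"Additional notes: {notes}")
    parts.append(f"Text to translate:\n{text}")
    return "\n\n".join(parts)

# A short, ambiguous input: German "Bank" can mean "bench" or "bank";
# the surrounding sentence resolves it.
prompt = build_translation_prompt(
    "Bank", "German", "English",
    surrounding_text="Wir setzten uns auf eine Bank im Park.",
    notes="Short input; the surrounding sentence disambiguates the word.",
)
```

The classic tools effectively called this with all the optional arguments set to None, which is exactly why one-word or one-line inputs came out wrong so often.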

RE YouTube and such - or any auto-transcription in video calls I've seen - I can't explain that by anything other than service providers cheaping out on this.

> From a workflow POV they only added higher compute costs for very questionable gains.

Regarding the costs - I imagine they may be an issue at scale, but for regular use (on-demand translation of individual passages, documents, recordings), it feels like it shouldn't be that noticeable anymore. You don't need to run GPT-5 for everything, some models you can run client-side already seem decent enough, and they keep improving.

> LLM based translation looks more convincing but requires the same level of scrutiny that previous tools did.

That's fair. Ultimately, if you don't know both languages, you can only trust the translation as much as you trust the translator (human or otherwise). We'll have to get a feel for this as much as we did with Google Translate, et al. In my experience, whenever I can verify them, results from LLMs are already vastly superior to prior art.

--

Tangent, and why I started considering LLMs as solving universal translation in the first place: 6 months ago, when I needed to talk with someone with whom I had zero language overlap, I tried several well-known translation apps (notably Google and Samsung), and none could manage. Then, on a whim, I just asked ChatGPT (in "advanced voice" mode) to "play a game": listen in and repeat whatever was just said in language A, but translated into language B, and vice versa. It worked flawlessly on the first try.