top | item 38990148

Ask HN: AI that allows you to make phone calls in a language you don't speak?

22 points| VictorPenJust | 2 years ago

Imagine this: You're trying to book a table at a sushi restaurant in Tokyo over the phone, but you don’t speak Japanese. With this hypothetical software, you could make the call in English, and the software would translate your speech and synthesize your voice in Japanese in real time. So, you would speak in English, and the restaurant would hear everything in Japanese.

This idea came to me during my travels in Japan and China, where English is not commonly spoken in many places. It's incredibly challenging to navigate without knowing the language. We often had to rely on our hotel's front desk to assist us with reservations and contacting support services.

I can envision other applications for this technology, such as in call centers, for international business calls, meetings, etc.

What do you guys think? Thanks in advance!

24 comments

drtgh|2 years ago

About six years ago I saw a tourist order in a restaurant by typing on his smartphone, translating the text, and playing it through a standard text-to-speech function while showing the screen to the waitress. It worked pretty well for him. Since then I have seen more and more people doing the same, mostly with common phrases.

The problem is an actual conversation between two people. Automated text translation is still not reliable: it can even introduce the opposite meaning, and it is not aware of nuance. It needs active supervision for now (and for the next few years).

With voice, some delay is needed to capture sentence context even when the source and target languages share structures, and the delay becomes mandatory when their structures differ. A general delay would also be advisable so that the interlocutors do not hear two voices at the same time. I'm not sure real-time translation like we see in Star Trek can be done (voice to voice, at least).

Important note: I would not recommend popularizing the synthesis of our personal voices. Variations on some reference voice models would be much better.

dustincoates|2 years ago

Samsung has already announced that live translation of calls will be coming to their next phones:

> AI Live Translate Call will soon give users with the latest Galaxy AI phone a personal translator whenever they need it. Because it’s integrated into the native call feature, the hassle of having to use third-party apps is gone. Audio and text translations will appear in real-time as you speak, making calling someone who speaks another language about as simple as turning on closed captions when you stream a show. Because it’s on-device Galaxy AI, you can trust that no matter the scenario, private conversations never leave your phone.

https://news.samsung.com/global/a-new-era-of-galaxy-ai-is-co...

jason-phillips|2 years ago

Knowing Samsung software, I would temper expectations a little.

rantallion|2 years ago

> and the software would translate ... in real-time

Depending on the pair of languages being translated between, isn't this literally impossible? The ordering of sentence parts is not the same in all languages (coincidentally, Japanese and English are a perfect example of how different grammar can be), so you often have to wait until you've heard the whole sentence before you can parse it and translate it into another language.

Given the above issue, how is what you're envisioning any better than just using Google Translate?
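To make the word-order point concrete, here is a minimal sketch of why output has to lag input by about a sentence. Everything in it is an assumption for illustration: `buffered_translate` and `toy_svo_to_sov` are hypothetical names, and the "translation" is only a toy reordering that moves the verb to the end, not a real translator.

```python
from typing import Callable, Iterable, Iterator, List

SENTENCE_END = {".", "?", "!"}


def buffered_translate(
    words: Iterable[str],
    translate_sentence: Callable[[List[str]], List[str]],
) -> Iterator[List[str]]:
    """Yield one translated sentence at a time.

    Nothing is emitted until a sentence-ending token arrives, so the
    listener always hears the output at least one sentence late.
    """
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        if word in SENTENCE_END:
            yield translate_sentence(buffer)
            buffer = []
    if buffer:  # flush a trailing fragment that had no end marker
        yield translate_sentence(buffer)


def toy_svo_to_sov(sentence: List[str]) -> List[str]:
    """Hypothetical stand-in for a translator.

    Assumes the second token is the verb and moves it to the end,
    just before the punctuation (subject-verb-object to subject-object-verb).
    """
    *body, end = sentence if sentence[-1] in SENTENCE_END else sentence + ["."]
    if len(body) < 3:
        return body + [end]
    subject, verb, *rest = body
    return [subject, *rest, verb, end]


if __name__ == "__main__":
    stream = ["I", "book", "table", ".", "We", "eat", "sushi", "."]
    for out in buffered_translate(stream, toy_svo_to_sov):
        print(" ".join(out))
```

The verb is the second word spoken but the last word emitted, so no amount of compute removes the wait; it can only be shortened by guessing ahead and risking a wrong guess.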

LargeTomato|2 years ago

In Arabic there is no word for "uncle". There is only "dad's brother" and "mom's brother". Often an English speaker will leave out the paternal/maternal part, so the translator has to guess or produce an ambiguous Arabic sentence. If the English speaker specifies the missing information later and the translator guessed wrong, the Arabic listener ends up confused.
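A rough sketch of that one-to-many problem, assuming a tiny hand-written lexicon. `KINSHIP_OPTIONS` and `translate_term` are hypothetical names, and "amm" and "khal" are simplified transliterations of the Arabic words for paternal and maternal uncle. The point is only that the translator must either get the missing detail or refuse to guess.

```python
from typing import Dict, Optional

# Toy lexicon (an assumption for illustration):
# English term -> {detail the target language requires: target word}
KINSHIP_OPTIONS: Dict[str, Dict[str, str]] = {
    "uncle": {"paternal": "amm", "maternal": "khal"},
}


def translate_term(term: str, side: Optional[str] = None) -> str:
    """Translate a term, refusing to guess when the detail is missing."""
    options = KINSHIP_OPTIONS.get(term)
    if options is None:
        return term  # not in the toy lexicon, pass through unchanged
    if side not in options:
        # Covers both a missing detail (None) and an unknown one.
        raise ValueError(f"'{term}' is ambiguous: need one of {sorted(options)}")
    return options[side]


if __name__ == "__main__":
    print(translate_term("uncle", "maternal"))
    try:
        translate_term("uncle")
    except ValueError as err:
        print(err)
```

A live call cannot stop to ask which uncle is meant, which is why a system would have to guess here and possibly correct itself later.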

nrmitchi|2 years ago

This is the kind of failure of expectations that Star Trek’s universal translator has set us up for.

As far as I know you’re right: a number of things either don’t translate or are too contextual for real-time translation.

I’d love to be proven wrong, but I think the underlying problem here isn’t necessarily the language itself, but differences in underlying mental models that the languages express.

bdhcuidbebe|2 years ago

> This idea came to me during my travels in Japan and China, where English is not commonly spoken in many places.

I got the same idea, but from Douglas Adams ;)

gorbypark|2 years ago

I’ve been keeping my eye on the Seamless M4T streaming project from Meta. Although I haven’t gotten it to run locally yet (mostly due to lack of time), I think it has the potential to enable things like real-time phone calls. My end goal is to have system-level, real-time translated transcriptions (for video conferences, etc.).

sargstuff|2 years ago

not a wicked googly jimmy cricket concept. web search engine terms lilliputing "pocket translator" shows quite a few options.

  https://itranslate.com/features/camera-translations

  https://blog.google/products/translate/see-world-in-your-lan...

  https://translate.google.com/about/

behnamoh|2 years ago

Didn't Google showcase this feature a couple years ago? ofc it's Google, so who knows if it ever got into production.

agilob|2 years ago

They did, during Google I/O in like 2018. Voice AI was supposed to make calls and book appointments with a hairdresser. They never released it. It was there only to bump stock prices.

CodeNest|2 years ago

Samsung will beat Apple. Good for them.

barbariangrunge|2 years ago

This will be the worst. We get enough spam calls as it is. Soon we’ll stop being able to use “broken English” as a clue that it’s a scam, and even hearing our parents’ voices won’t be evidence it’s a real call.

pixl97|2 years ago

Just look at it like the dam has already broken, but the wave of destruction isn't here yet. Also, the voice will sound like your grandparents or uncle, probably because they used some malware app on their phone that stole their voice imprint and sold it off.

karmakaze|2 years ago

I was about to comment that this isn't AI, then realized that it's a pointless distinction now. Whether something is AGI will still be meaningful to know once we get there; until then, all AI/ML tech will simply keep being called AI, or computer, agent/assistant, or whatever.