Google Turns Your Android Phone Into An On-The-Fly Conversation Interpreter

[+] ghshephard|15 years ago|reply

This is almost moving Kurzweil's prediction from "mostly correct" to "correct".

"Early 2000s

    * Translating telephones allow people to speak to 
      each other in different languages.
    * Machines designed to transcribe speech into 
      computer text allow deaf people to understand 
      spoken words."

Every time he notches another victory, I pay closer and closer attention to his other guesses that are 10 - 30 years out.

[+] feral|15 years ago|reply

I don't buy that Kurzweil prediction accuracy stuff.

There's just so much wiggle room when you start allowing categorisations like 'mostly correct', and the sort of equivocation that seems to be going on. I thought his recent evaluation of predictions was weak.

First off, this particular technology could probably have been built, very badly, in the early 90s. It would have had accuracy too low to make it useful. It's cool that Google is doing this now, certainly, but the questions on accuracy and speed need to be answered. I'm sure they'll do a good job, but my point is that there is a big difference between when a technology is first prototyped in a lab, in some form, and when it's mature enough for widespread consumption. The predictions really need to say which of the two they are referring to in order to be meaningful - a big problem with the recent evaluation.

Further, if your thesis is that technological capacity grows exponentially, like Moore's law, then being 8 to 10 years off is pretty wrong.
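To put a rough number on that, here is a minimal sketch. It assumes a 2-year doubling period, which is only a common rule of thumb for Moore's law and not a figure from Kurzweil; the function name is made up for illustration.

```python
# Assumption: capability doubles every 2 years (a rule-of-thumb reading of
# Moore's law, not a figure taken from Kurzweil's books).
DOUBLING_PERIOD_YEARS = 2.0


def capability_ratio(years_off: float,
                     doubling_period: float = DOUBLING_PERIOD_YEARS) -> float:
    """Factor by which capability grows over `years_off` years of steady doubling."""
    return 2.0 ** (years_off / doubling_period)


for years in (8, 10):
    print(f"{years} years late -> {capability_ratio(years):.0f}x capability gap")
```

Under that assumption, a prediction that lands 8 to 10 years late is off by a factor of 16 to 32 in underlying capability, which is why the miss looks large on an exponential curve.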

Finally, I'll just say that the charts of progress and development in the book 'The Singularity Is Near' are hard to take seriously. Subjective technological milestones are arbitrarily chosen, sometimes over millennial timespans. These are plotted log-log, lines are fitted, and inferences are drawn that make predictions for the next 30 years. What's the margin of error of a methodology like that?

I'm not disagreeing with any particular thesis; his writing certainly stirs healthy debate on important ideas, and it's certainly worth pointing out to a whole lot of people that technological growth is non-linear.

But beyond that, I'm not paying too much attention to the future predictions.

[+] billpaetzke|15 years ago|reply

Predictions made by Ray Kurzweil:

http://en.wikipedia.org/wiki/Predictions_made_by_Ray_Kurzwei...

[+] BRadmin|15 years ago|reply

Screenings for a documentary about him, Transcendent Man, were finally posted yesterday:

http://transcendentman.com/

[+] abecedarius|15 years ago|reply

How well does this system handle your second bullet point (transcription for conversation with the deaf)? I have a use for that but, not having an Android phone yet, I can't just try it out.

[+] nazgulnarsil|15 years ago|reply

I was forced to revise my estimate in Kurzweil's direction of 2040 for AI/whole brain emulation after seeing AIXI-MC, and the memristor brain project being funded by DARPA. My original estimate was more like 2080.

[+] michaelbuckbee|15 years ago|reply

What's most interesting to me is that Google seems to be moving forward with a strategy of competing with iOS via services instead of just applications.

Services (like Conversation Mode in Translate, constantly updated turn-by-turn GPS driving directions, and more deeply integrated Google Talk) give them leverage against the carriers, who have to meet Google's guidelines for Android in order to include the flagship apps. They also represent a really high bar that Apple would have to overcome to compete.

I realize it isn't exactly as cut and dried as services vs applications, but it is certainly a strong move that plays to Google's strengths.

[+] nostrademons|15 years ago|reply

I suspect that mobile in general will move towards services instead of applications. There's a lot of data crunching that you can do when you have a data collection device in every person's pocket, but doing that crunching on the client will drain your battery in no time. I predict that the really interesting mobile apps will have a thin (but native) client that only does data collection & UI, and then the most computationally interesting piece will be hosted in the cloud somewhere.

[+] Andrenid|15 years ago|reply

In the last 6-12 months alone I'm really starting to see "the future" that I dreamed of as a kid.

Between this article, http://questvisual.com/, Microsoft Kinect, Nintendo 3DS, Microsoft Surface 2 (first version didn't impress me, was so huge and clunky), Amazon's Kindle 3 (huge fan of HHGTTG, the Kindle basically IS the guide) etc...

As a life-long nerd/geek, it's pretty awe-inspiring.

[+] enjo|15 years ago|reply

I'll second this.

The Kinect, in particular, is my invention of the year. I've had it for two months now and the freaking menu in Dance Central STILL doesn't get old. Just moving through that menu is the single most tactile thing I've ever done when interacting with a machine.

I LOVE it... and I can't wait to see how that technology grows and changes in the coming years.

[+] othello|15 years ago|reply

I would also add to that list the Emotiv kit (http://www.emotiv.com), which allows you "to create applications that can be controlled by your mind" by making use of the same type of technology that allows paralytics to control robotic arms.

Actually, the technology alone is highly impressive, but that an SDK is available to anyone for $299 is nothing short of mind-boggling.

[+] Estragon|15 years ago|reply

Actually, a smartphone with offline Wikipedia is a much closer approximation to The Guide than the Kindle.

[+] anigbrowl|15 years ago|reply

What age are you, out of curiosity?

[+] kenjackson|15 years ago|reply

Take this and Word Lens-like capability -- if I were in junior high, I'd make the argument that there's no need for me to learn a foreign language. In a few years, I can speak and write any language there is!

[+] alextp|15 years ago|reply

This might save you some time/money, but it certainly can't bridge the gap between different languages. Keep in mind that "to translate is to betray" http://en.wikipedia.org/wiki/Traduttore,_traditore

[+] ximeng|15 years ago|reply

Imagine having a real-time interpreted conversation like they do in this Microsoft video:

http://zekeweeks.com/2010/03/03/real-life-babelfish-the-tran...

The delay and imperfections would mean you communicate at less than half the speed you would if you spoke the language natively. That's going to get annoying quickly.

[+] lukev|15 years ago|reply

Automated translation can get 90% of the way there, and we're very close to this point already. But for that final 10%, you basically need full AI that's capable of actually comprehending the subject matter in order to provide an adequate translation.

I think that's probably more than a couple decades out.

[+] jokermatt999|15 years ago|reply

There are still nuances and subtleties that will be lost in machine translation (I'd assume it would still take strong AI to do otherwise), so there is still plenty of reason to fully learn a language. However, learning a language at the junior high/high school level of two years or so of minimal study probably can be replaced by machine translation in terms of being able to communicate basic thoughts.

[+] ghshephard|15 years ago|reply

Agreed. It will take a while to have "offline" translators available, but if we can keep extending Moore's Law and increasing storage densities, then I'll conservatively predict that in 20 years we'll have pretty close to flawless Babel fish-like technology available, and that's just so we have five years to move the "online translation" technology to a local device. There are a number of research projects that are starting to make progress in this area.

2015: We will have universal dictionary lookup capability for most languages, and will have word-for-word translation in spoken, clearly written (handwriting AI is a long way off - I'll make no predictions there), read, and listened-to language. We will also have the capability to convert between all four (write down a word in English, have it displayed and spoken out in French).

2020 will be the year that we'll start to see reasonably good translation systems that take into account some amount of nuance beyond word-for-word translation. This will be in a research environment, but will quickly move out of that environment into commercial applications.

2025 will be the year that Translation, as a skill set, will start to be replaced by machines - in particular, subtitling for movies into local languages will be predominantly done by machine systems for all but the highest-end productions.

2030 will be the year that we can write to, read, speak to, and understand anyone in the world in each of our native languages, anywhere, any time. It will also be the year that Language Translation systems will be seen as a reasonable alternative to human translators.

So, as long as you plan on being a translator before 2030 - you should be okay - after that, all bets are off as to whether you have a career.

[+] cageface|15 years ago|reply

This just brings you into the uncanny valley of NLP. To say that this is almost as good as native fluency is like saying that current CG animation is almost photo-realistic. Like so many things in technology, closing the gap of the last percent turns out to be as hard as the first 99 put together.

The need for phrasebook-level fluency in another language may be diminished soon by technologies like these though.

[+] loewenskind|15 years ago|reply

Language is more than just words. It's culture, a way of thinking, a new point of view. If you're not interested in those things then you almost certainly never needed to learn a foreign language (English will get you by in most places and if you don't care about culture, why worry about the places it won't?).

[+] aw3c2|15 years ago|reply

Learning a foreign language is fun! Hell, I enjoy using English after all those years I was forced to learn it.

[+] spiffworks|15 years ago|reply

Just tried it, works only for the English-Spanish pair right now, but the reliability is pretty good considering that they're calling it an Alpha.

[+] gms|15 years ago|reply

The real question is how far it will move from alpha, if at all.

[+] pfarrell|15 years ago|reply

My hovercraft is full of eels. Sorry, couldn't resist. Natural language processing, could we really see major improvements in our lifetimes? My gut tells me there are so many nuances to the way we work that the best we can get to will have to include many of the inconsistencies we have in understanding each other.

[+] erikstarck|15 years ago|reply

The vodka is strong but the meat is rotten.

[+] fhars|15 years ago|reply

Except that if you actually travel in parts of the world where you would need this service, international data roaming charges for accessing this service will be so outrageous that it may just be cheaper to hire a professional interpreter to travel with you.

[+] johnyzee|15 years ago|reply

Has anyone else been waiting for this 'invention' forever? Between speech recognition and machine translation it seemed to be a solved problem for the longest time.

[+] trurl123|15 years ago|reply

First, Google should improve http://translate.google.com so that it shows valid translations.

[+] andrewljohnson|15 years ago|reply

Douglas Adams was mostly right. It's just not a fish.

[+] tocomment|15 years ago|reply

Anyone know when this will be available to install? I'm itching to try it out.

[+] pedanticfreak|15 years ago|reply

I have nothing meaningful to contribute, but I want to say this is a monumental step forward.

Some people here seem to predict this heralds the end of language education for most people. Maybe that's true. But I actually think this will contribute to earlier and more frequent exposure to foreign languages. And that may ultimately do more for language learning than compulsory education ever did.

[+] chiquillo|15 years ago|reply

[deleted]

[+] JunkDNA|15 years ago|reply

"Google is quick to note that this is very much an alpha feature. In other words, expect a lot of hiccups. They note that background noise, thick accents, and quick speech can all trip up the app."

Wow, so it's not practically useful very frequently?

[+] minalecs|15 years ago|reply

Dude... Google is getting closer to a real-time universal speech translator and you find a way to be unimpressed. What does impress you?

[+] brown9-2|15 years ago|reply

Who would have thought that alpha software is in fact very alpha-like in its qualities?

[+] felipe|15 years ago|reply

The "background noise, thick accents, and quick speech" is actually the most difficult part.

Google did not really do anything new. Speech recognition has been around for a while, as have translation engines. Although putting the two technologies on a cell phone makes for an impressive demo and good PR, it does not actually solve the problem at a practical level.

64 comments