Open-source tool translates and dubs videos into other languages using AI

182 points | oldcai | 1 year ago | github.com

116 comments

netsharc|1 year ago

Huh, if it's good enough, movies/TV shows dubbed with an AI clone of the original voice would be great (if we can ignore the ethics of using the actor's voice and the loss of work for the dubbing companies and actors).

For example here's how weird Friends is in German: https://www.youtube.com/watch?v=nCoNSZV--z0 . Or Italian: https://www.youtube.com/watch?v=wO5qTzvyQ1s

Can AI detect the emotional tone of sentences yet, and recreate it in the target language?

yieldcrv|1 year ago

Yes, have you used OpenAI’s voice model? It uses and reacts to tones

My favorite conversation has been getting it to tell me about marshmallow vs marshmellow spelling and pronunciation, it became very strict but patient with me

It can reply in other languages too, but I can't judge dialects well enough to say how good it is

skeledrew|1 year ago

In my experience, human dubbing never captures the original tone anyway. It probably never can unless it's done by people who are fluent in both source and target languages and are also good at voice acting. And so I have a huge preference for subs so I can appreciate the nuance in the original voices.

giorgiobalduino|1 year ago

> For example here's how weird Friends is in German: https://www.youtube.com/watch?v=nCoNSZV--z0 . Or Italian: https://www.youtube.com/watch?v=wO5qTzvyQ1s

Can't speak for the German dubbing, but the Italian version sounds natural to us Italians, while the original English version is hard to relate to and bond with. The dubbing makes it "close" to home, if that makes sense. You might feel it's weird because you've grown accustomed to watching the original version while also immersed in everything that sitcom portrays.

_moof|1 year ago

> (if we can ignore the ethics of using the actor's voice and the loss of work for the dubbing companies and actors)

This is a shocking parenthetical.

cubbic|1 year ago

Oh I made something similar but for Netflix!

https://github.com/cubbK/dubbing_ai_netflix_client

I want to learn Swedish, and because there are so few movies dubbed in Swedish, I take the subtitles (Netflix is good at having subtitles in different languages) and text-to-speech them :)

birktj|1 year ago

Why not simply watch native Swedish content instead? There should be quite a lot globally available for free from SVT.

sam_perez|1 year ago

How would you rate the quality of the dubs?

Do you think it's usable for learning? Seems like you could end up with some quirky learnings.

sunnybeetroot|1 year ago

This is fantastic! Any direction on if I wanted to change the language to something else?

vouaobrasil|1 year ago

Unlike other methods of automation, AI is replacing human beings too fast. And before you say, "new jobs will be created" -- look at history. After the computer, new jobs have been created, but what kind of jobs? Every year, we are becoming more entwined in wage slavery as the wealth accumulates at the top and jobs become more meaningless.

So, no, new jobs will not be created, except the kind of jobs that crush the human spirit into oblivion so that the rich tech oligarchs can play God.

insane_dreamer|1 year ago

The problem is that we are hurtling towards the unknown without a plan, driven by the “need” to make higher returns for shareholders and to capture the new market. It may be that some new types of employment magically appear that soak up the jobs lost, but you can be sure there is no one working on solving that problem since the goal is to eliminate labor not create it.

Kiro|1 year ago

Strange submission to post this comment on. It's not like translating and dubbing videos is the highest form of labor.

fleischhauf|1 year ago

the reaction is wrong. it shouldn't be "oh no, jobs are being removed" but "nice, less work, more automation, let's make sure we all benefit through less work and not only the rich with more profits"

IndySun|1 year ago

>...new jobs will not be created, except the kind of jobs that crush the human spirit into oblivion...

AI certainly means everyone will be able to create 'art' and as a result we'll have more art than we know what to do with, music and images are already confetti, soon so will full length 'films/movies'. That leaves anyone who can actually sing, paint, play, dance, in prime position to take up those mantles.

lmm|1 year ago

> After the computer, new jobs have been created, but what kind of jobs? Every year, we are becoming more entwined in wage slavery as the wealth accumulates at the top and jobs become more meaningless.

What are you talking about? Many of us have tech jobs with much more comfort, creativity and autonomy than the jobs they displaced, and computerisation has made it much more practical for those who dare to strike out on their own, rather than needing wealthy family or friends before you can even begin to think of starting a business.

hatenberg|1 year ago

Transformer based AI is basically printing machines for the knowledge economy from existing labor patterns.

Nobody currently can say which patterns it cannot extract, hence "we always figured out new jobs" is ... challenged

SoftTalker|1 year ago

> After the computer, new jobs have been created, but what kind of jobs?

Um, jobs where someone under age 30 can be earning hundreds of thousands of dollars a year programming them?

homarp|1 year ago

Can't I, with that, create videos that have the whole world as potential viewers?

And the videos can be the product, or tutorials for other products. This allows me to do more, not less.

nsonha|1 year ago

so maybe don't cling to "the job" so much, hoping it can somehow fulfil your life. If the job can be automated by a machine, then isn't it already meaningless and mundane, and boring you to death anyway?

I agree with the point about "wealth accumulates at the top" though. Maybe Karl Marx was right about a thing or 2. Maybe the distribution of wealth should not fall into the hands of non-elected corporations. Whatever it is, it should be determined by a democratic process and not some "market mechanism" that is actually just arbitrary algorithms optimized for metrics no actual human cares about.

nickthegreek|1 year ago

>AI is replacing human beings too fast.

citation needed.

numpad0|1 year ago

I don't think generative AI is replacing humans at all. It's like how SQL replaced software engineers, with the added bonus of copyright doubts gatekeeping common folks from exploiting it. It's obviously killing the open Internet fast and encouraging power concentration too. It's the worst of several worlds.

skummetmaelk|1 year ago

Localization and dubbing is a sad endeavour. By trying to accommodate everyone's individual preference for information transmission we accomplish nothing more than reducing our ability to understand each other in the long run.

Having a Babelfish is all well and good. Until it stops working, and you realise no one can understand each other any more.

Ironically localization is often pushed by well meaning Americans who only speak one language. "Oh, you're in a French speaking region. You MUST want French language. Let me force it down your throat while I prance around virtue signalling about how inclusive we are"

pjc50|1 year ago

This is a terrible take, and you should have at least included the "forced" dub disclaimer from your comment below. Without at least one of sub/dub, foreign (relative to your current location!) language content basically doesn't get consumed at all except by a very small minority of people who are very keen on the content anyway - or are speakers of the language anyway.

Now, as veterans of anime forum wars will know, subtitling is nearly always better than dubbing, and I hope this tech is capable of that as well. Most media systems let you put a whole load of subtitle tracks on and then pick one.

There's far, far too much content out there for more than a fraction of it to be ever professionally translated. While we should expect human translation review and a spot of localization for officially released works, most of the internet is just free content being given away for very little return. And that's where automatic translation is going to shine: release the non-English meme champions! Let us have a look in Bilibili!

kevingadd|1 year ago

Are you saying it's a bad thing if the creator of a work decides they want it localized or dubbed into other languages? I don't understand why you want to take that choice out of their hands.

Which languages someone speaks isn't simply a matter of "individual preference". Learning a new language takes a lot of time and energy, and people only have the time to learn a handful of languages in most cases unless they can make a career out of linguistics.

i.e. I know a sprinkling of words in various languages, and I've started learning Japanese, but I simply don't have the time to also learn Mandarin, Korean, Cantonese, etc. So I appreciate it when authors of works in those languages offer localizations into a language I can speak, or when third parties spend their time translating stuff for free to make it available to a wider audience.

What's the advantage of closing knowledge and communication off from a wider audience?

Maybe I'm misunderstanding and you're just angry about Google Translate/DeepL etc (which I have a strong distaste for since they're Fake)

written-beyond|1 year ago

Interesting take, never really looked at it that way.

LeoPanthera|1 year ago

For years I've wanted this for live TV. Even just subtitles would be amazing. I've always wanted to be able to watch news TV from other countries.

codedokode|1 year ago

As I understand it, it first extracts text from the original video into subtitles, translates them using an external LLM, and then converts the text to speech. All of this is done using third-party solutions, and the project seems to be just a GUI app that integrates them.

You obviously cannot use this to translate songs or movies because this method loses important information like voice, intonation, etc.

So it is still better to use subtitles.
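The three stages described above (speech to text, translation, text to speech) can be sketched as a small pipeline. This is only an illustration, not the project's actual code: `Segment`, `dub_pipeline` and the `fake_*` stand-ins are hypothetical names, and each stage is passed in as a plain function so that any third-party service could be swapped in.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    end: float
    text: str


def dub_pipeline(
    audio_path: str,
    transcribe: Callable[[str], List[Segment]],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> List[Tuple[Segment, bytes]]:
    """Run transcribe -> translate -> synthesize, keeping the original timings."""
    dubbed = []
    for seg in transcribe(audio_path):
        translated = Segment(seg.start, seg.end, translate(seg.text))
        dubbed.append((translated, synthesize(translated.text)))
    return dubbed


# Stand-in stages (assumptions) so the sketch runs without any external service.
def fake_transcribe(path: str) -> List[Segment]:
    return [Segment(0.0, 1.5, "hello"), Segment(1.5, 3.0, "world")]


def fake_translate(text: str) -> str:
    return {"hello": "hallo", "world": "Welt"}.get(text, text)


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")  # placeholder bytes, not real audio


result = dub_pipeline("video.mp4", fake_transcribe, fake_translate, fake_synthesize)
for seg, audio in result:
    print(f"{seg.start:.1f}-{seg.end:.1f}s: {seg.text!r} ({len(audio)} bytes)")
```

Only the text and its timings pass from stage to stage here, which is why voice and intonation are lost unless a stage carries them along explicitly.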

mrtksn|1 year ago

Back in high school, when I got my first PC a plumber came over to fix some stuff and when he saw the computer he got excited and asked some questions and one of the questions was “how do you translate the VCD with this, I have a movie to watch but hate subtitles”.

I was like “silly dude doesn’t know how computers work” but maybe I was the silly one who can’t dare to imagine how something like that can work.

gagabity|1 year ago

Cool what languages can it do?

Yandex Browser does the most impressive version of this, and for free, but only into Russian I believe. It's quite amazing: it uses appropriately different voices and follows the correct intonation for everyone, and it takes just a few seconds for a YT video.

nsonha|1 year ago

This could be useful in combating fake news. In many videos, especially in political news, foreign languages are dubbed over with sometimes nuanced translations that can skew audiences to (mis)understand the content in certain ways.

kevingadd|1 year ago

A translation lacking nuance/precision (due to being the work of machine learning) can also cause significant misunderstandings, though. I'm not sure you win or lose in that regard by switching from humans to machines.

pjc50|1 year ago

The AI is quite capable of inserting its own translation errors.

cyberax|1 year ago

I would pay a lot for a tool that removes the freaking laugh track from videos.

I just physically can't watch them. I wanted to watch the Blackadder series, but I couldn't even get through one episode.

lossolo|1 year ago

Based on the English docs, it seems it's not dubbing but voice-over.

hulitu|1 year ago

> Open-source tool translates and dubs videos into other languages using AI

Is there any assessment of how good the translation is?

CyberDildonics|1 year ago

Is this using an open source text to speech model or is it going out to some other internet service?

tourmalinetaco|1 year ago

It can use Whisper which is open source, although optionally it can also use GoogleSpeech.

paulkon|1 year ago

Is there an open source speech-to-speech model which retains intonation, cadence and delivery?

randomgiy3142|1 year ago

Because translations are copyrighted, it is complex to get legal rights for them.

ewuhic|1 year ago

This one does dubbing, but is there an equivalent tool for subs?

fbnt|1 year ago

Not open source, but https://fluen.ai does a good job at translating subs, while using your own or standard style guides for the target language, i.e. reading speed, max characters, "chunking" sentences where it grammatically makes sense, re-adapting them etc.
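To make the style-guide constraints mentioned above concrete, here is a minimal sketch of checking a subtitle cue against a reading-speed limit and a maximum line length. It says nothing about how fluen.ai works: the `Cue` class, the helper names and both limits (42 characters per line, 17 characters per second) are assumptions for illustration, and `textwrap` only breaks at whitespace, not at grammatical boundaries.

```python
import textwrap
from dataclasses import dataclass

MAX_CHARS_PER_LINE = 42    # assumed limit; real style guides vary
MAX_CHARS_PER_SECOND = 17  # assumed limit; real style guides vary


@dataclass
class Cue:
    start: float  # seconds
    end: float
    text: str


def reading_speed(cue: Cue) -> float:
    """Characters per second the viewer has to read, ignoring line breaks."""
    duration = cue.end - cue.start
    if duration <= 0:
        return float("inf")
    return len(cue.text.replace("\n", "")) / duration


def too_fast(cue: Cue, limit: float = MAX_CHARS_PER_SECOND) -> bool:
    return reading_speed(cue) > limit


def wrap_cue(cue: Cue, width: int = MAX_CHARS_PER_LINE) -> Cue:
    """Break the text into lines of at most `width` characters at whitespace."""
    lines = textwrap.wrap(cue.text, width=width)
    return Cue(cue.start, cue.end, "\n".join(lines))


cue = Cue(0.0, 5.0, "This is a fairly long subtitle line that needs to be wrapped somewhere")
wrapped = wrap_cue(cue)
print(wrapped.text)
print("too fast:", too_fast(wrapped))
```

A cue that fails the speed check would need a shorter translation or a longer display time, which is the "re-adapting" step a naive wrapper like this cannot do.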

cyanydeez|1 year ago

So uh, what's with AI products throwing out the gold standard in testing these claims?