Huh, if it's good enough, movies/TV shows dubbed with an AI clone of the original voice would be great (if we can ignore the ethics of using the actor's voice and the loss of work for dubbing companies and actors).
Yes, have you used OpenAI’s voice model? It uses and reacts to tone.
My favorite conversation has been getting it to tell me about marshmallow vs. marshmellow spelling and pronunciation; it became very strict but patient with me.
It can reply in other languages too, but I can't judge dialects well enough to say how it does.
In my experience, human dubbing never captures the original tone anyway. It probably never can unless it's done by people who are fluent in both the source and target languages and are also good at voice acting. So I have a huge preference for subs, so that I can appreciate the nuance in the original voices.
Can't talk for the German dubbing, but the Italian version sounds natural to us Italians while the original, English version, is hard to relate to and create a bond with. The dubbing makes it "close" to home if that makes sense. You might feel it's weird because you've grown accustomed to watching the original version while also immersed in everything that sitcom portrays.
I want to learn Swedish, and because there are so few movies dubbed in Swedish, I take the subtitles (Netflix is good at having subtitles in different languages) and run them through text-to-speech :)
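For anyone wanting to try the same subtitles-to-speech trick, here is a rough sketch of the first half: turning a subtitle file into timed cues that a text-to-speech engine could read out. It assumes the subtitles have been exported in SRT format (an assumption; the comment doesn't say which format is used), and `parse_srt` and `to_seconds` are hypothetical helper names, not part of any project mentioned here.

```python
import re
from typing import List, Tuple

# SRT timestamps look like 00:01:02,500 (hours:minutes:seconds,milliseconds).
TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")


def to_seconds(stamp: str) -> float:
    """Convert one SRT timestamp to seconds."""
    match = TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt: str) -> List[Tuple[float, float, str]]:
    """Return (start, end, text) for each cue; multi-line text is joined with spaces."""
    cues = []
    # Cues are separated by blank lines: index line, timing line, then text lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip anything that does not look like a cue
        start, end = lines[1].split("-->")
        cues.append((to_seconds(start), to_seconds(end), " ".join(lines[2:])))
    return cues
```

Each cue's text can then be handed to whatever TTS engine you prefer, using the start time to line the audio up with the video.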
Unlike other methods of automation, AI is replacing human beings too fast. And before you say, "new jobs will be created" -- look at history. After the computer, new jobs have been created, but what kind of jobs? Every year, we are becoming more entwined in wage slavery as the wealth accumulates at the top and jobs become more meaningless.
So, no, new jobs will not be created, except the kind of jobs that crush the human spirit into oblivion so that the rich tech oligarchs can play God.
The problem is that we are hurtling towards the unknown without a plan, driven by the “need” to make higher returns for shareholders and to capture the new market.
It may be that some new types of employment magically appear that soak up the jobs lost, but you can be sure there is no one working on solving that problem since the goal is to eliminate labor not create it.
The reaction is wrong. It shouldn't be "oh no, jobs are being removed" but "nice, less work and more automation; let's make sure we all benefit through less work, not only the rich through more profits".
>...new jobs will not be created, except the kind of jobs that crush the human spirit into oblivion...
AI certainly means everyone will be able to create 'art', and as a result we'll have more art than we know what to do with. Music and images are already confetti; soon full-length 'films/movies' will be too. That leaves anyone who can actually sing, paint, play, or dance in prime position to take up those mantles.
> After the computer, new jobs have been created, but what kind of jobs? Every year, we are becoming more entwined in wage slavery as the wealth accumulates at the top and jobs become more meaningless.
What are you talking about? Many of us have tech jobs with much more comfort, creativity and autonomy than the jobs they displaced, and computerisation has made it much more practical for those who dare to strike out on their own, rather than needing wealthy family or friends before you can even begin to think of starting a business.
So maybe don't cling to "the job" so much, hoping it will somehow fulfil your life. If the job can be automated by a machine, then isn't it already meaningless and mundane, boring you to death anyway?
I agree with the point about "wealth accumulates at the top" though. Maybe Karl Marx was right about a thing or two. Maybe the distribution of wealth should not fall into the hands of unelected corporations. Whatever it is, it should be determined by a democratic process and not some "market mechanism" that is actually just arbitrary algorithms optimized for metrics no actual human cares about.
I don't think generative AI is replacing humans at all. It's like how SQL replaced software engineers, with the added bonus of copyright doubts keeping ordinary people from exploiting it. It's obviously killing the open Internet fast and encouraging concentration of power too. It's the worst of several worlds.
Localization and dubbing are a sad endeavour. By trying to accommodate everyone's individual preference for information transmission, we accomplish nothing more than reducing our ability to understand each other in the long run.
Having a Babelfish is all well and good. Until it stops working, and you realise no one can understand each other any more.
Ironically localization is often pushed by well meaning Americans who only speak one language. "Oh, you're in a French speaking region. You MUST want French language. Let me force it down your throat while I prance around virtue signalling about how inclusive we are"
This is a terrible take, and you should have at least included the "forced" dub disclaimer from your comment below. Without at least one of sub/dub, foreign (relative to your current location!) language content basically doesn't get consumed at all except by a very small minority of people who are very keen on the content anyway - or are speakers of the language anyway.
Now, as veterans of anime forum wars will know, subtitling is nearly always better than dubbing, and I hope this tech is capable of that as well. Most media systems let you put a whole load of subtitle tracks on and then pick one.
There's far, far too much content out there for more than a fraction of it to be ever professionally translated. While we should expect human translation review and a spot of localization for officially released works, most of the internet is just free content being given away for very little return. And that's where automatic translation is going to shine: release the non-English meme champions! Let us have a look in Bilibili!
Are you saying it's a bad thing if the creator of a work decides they want it localized or dubbed into other languages? I don't understand why you want to take that choice out of their hands.
Which languages someone speaks isn't simply a matter of "individual preference". Learning a new language takes a lot of time and energy, and people only have the time to learn a handful of languages in most cases unless they can make a career out of linguistics.
For example, I know a sprinkling of words in various languages, and I've started learning Japanese, but I simply don't have the time to also learn Mandarin, Korean, Cantonese, etc. So I appreciate it when authors of works in those languages offer localizations into a language I can speak, or when third parties spend their time translating stuff for free to make it available to a wider audience.
What's the advantage of closing knowledge and communication off from a wider audience?
Maybe I'm misunderstanding and you're just angry about Google Translate/DeepL etc (which I have a strong distaste for since they're Fake)
As I understand it, it first extracts the text from the original video into subtitles, translates them using an external LLM, and then converts the text to speech. All of this is done using third-party solutions, and the project seems to be just a GUI app that integrates them.
You obviously cannot use this to translate songs or movies because this method loses important information like voice, intonation, etc.
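To make that three-stage flow concrete, here is a minimal sketch with the translation and text-to-speech stages passed in as plain callables. It is only an illustration of the shape described above, not the project's actual code: `Segment`, `dub`, `translate`, and `synthesize` are all hypothetical names, and the speech-to-text stage is assumed to have already produced the timed segments.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Segment:
    """One subtitle-like unit: when it is spoken and what is said."""
    start: float
    end: float
    text: str


def dub(
    segments: List[Segment],
    translate: Callable[[str], str],     # hypothetical: wraps an external LLM/translator
    synthesize: Callable[[str], bytes],  # hypothetical: wraps a third-party TTS engine
) -> List[Tuple[Segment, bytes]]:
    """Translate each segment's text, then synthesize audio for the translation."""
    dubbed = []
    for segment in segments:
        translated = replace(segment, text=translate(segment.text))
        dubbed.append((translated, synthesize(translated.text)))
    return dubbed
```

Note that only the text and the timings cross from one stage to the next, which is exactly why the original voice and intonation get lost along the way.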
Back in high school, when I got my first PC a plumber came over to fix some stuff and when he saw the computer he got excited and asked some questions and one of the questions was “how do you translate the VCD with this, I have a movie to watch but hate subtitles”.
I was like “silly dude doesn’t know how computers work” but maybe I was the silly one who can’t dare to imagine how something like that can work.
Yandex Browser does the most impressive version of this, and for free, but only into Russian, I believe. It's quite amazing: it picks appropriately different voices and follows the correct intonation for everyone, and it takes just a few seconds for a YT video.
This could be useful in combating fake news. In many videos, especially in political news, foreign languages are dubbed over, sometimes with a nuanced translation that can skew audiences to (mis)understand the content in certain ways.
A translation lacking nuance/precision (due to being the work of machine learning) can also cause significant misunderstandings, though. I'm not sure you win or lose in that regard by switching from humans to machines.
Not open source, but https://fluen.ai does a good job of translating subs while applying your own or standard style guides for the target language, i.e. reading speed, max characters, "chunking" sentences where it makes grammatical sense, re-adapting them, etc.
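For readers unfamiliar with subtitle style guides, a rough sketch of what checks like "reading speed" and "max characters" can look like is below. This is not how fluen.ai works (that isn't public here); the limits of 42 characters per line and 17 characters per second are illustrative assumptions, the function names are hypothetical, and the chunking is a naive split at word boundaries rather than at grammatical boundaries.

```python
import textwrap
from typing import List

MAX_CHARS_PER_LINE = 42      # assumed limit, for illustration only
MAX_CHARS_PER_SECOND = 17.0  # assumed reading-speed limit, for illustration only


def chunk_lines(text: str, max_chars: int = MAX_CHARS_PER_LINE) -> List[str]:
    """Split text into lines of at most max_chars, breaking at word boundaries."""
    return textwrap.wrap(text, width=max_chars)


def reading_speed(text: str, duration_s: float) -> float:
    """Characters per second a viewer must read to keep up with the cue."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return len(text) / duration_s


def style_problems(text: str, duration_s: float, max_lines: int = 2) -> List[str]:
    """List the style-guide limits that this cue exceeds (empty if none)."""
    problems = []
    if len(chunk_lines(text)) > max_lines:
        problems.append("too many lines")
    if reading_speed(text, duration_s) > MAX_CHARS_PER_SECOND:
        problems.append("reads too fast")
    return problems
```

A translated cue that fails these checks would need shortening or re-timing, which is presumably the "re-adapting" step.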
netsharc|1 year ago
For example here's how weird Friends is in German: https://www.youtube.com/watch?v=nCoNSZV--z0 . Or Italian: https://www.youtube.com/watch?v=wO5qTzvyQ1s
Can AI detect the emotional tone of sentences yet, and recreate it in the target language?
yieldcrv|1 year ago
skeledrew|1 year ago
giorgiobalduino|1 year ago
programjames|1 year ago
https://arxiv.org/abs/2312.01479
devindotcom|1 year ago
_moof|1 year ago
This is a shocking parenthetical.
cubbic|1 year ago
https://github.com/cubbK/dubbing_ai_netflix_client
birktj|1 year ago
sam_perez|1 year ago
Do you think it's usable for learning? Seems like you could end up with some quirky learnings.
sunnybeetroot|1 year ago
vouaobrasil|1 year ago
pizza234|1 year ago
insane_dreamer|1 year ago
Kiro|1 year ago
fleischhauf|1 year ago
IndySun|1 year ago
lmm|1 year ago
hatenberg|1 year ago
Nobody currently can say which patterns it cannot extract, hence "we always figured out new jobs" is ... challenged
SoftTalker|1 year ago
Um, jobs where someone under age 30 can be earning hundreds of thousands of dollars a year programming them?
homarp|1 year ago
and the videos can be the products, or tutorials for other products. This allows me to do more, not less.
nsonha|1 year ago
nickthegreek|1 year ago
citation needed.
numpad0|1 year ago
skummetmaelk|1 year ago
pjc50|1 year ago
kevingadd|1 year ago
written-beyond|1 year ago
LeoPanthera|1 year ago
codedokode|1 year ago
So it is still better to use subtitles.
mrtksn|1 year ago
gagabity|1 year ago
nsonha|1 year ago
kevingadd|1 year ago
pjc50|1 year ago
cyberax|1 year ago
I just physically can't watch them. I wanted to watch the Blackadder series, but I couldn't even get through one episode.
deckar01|1 year ago
If you can train an instrument model on laugh tracks, Demucs should do that.
lossolo|1 year ago
alphabetatheta|1 year ago
underdeserver|1 year ago
hulitu|1 year ago
Is there any assessment of how good the translation is?
CyberDildonics|1 year ago
tourmalinetaco|1 year ago
paulkon|1 year ago
randomgiy3142|1 year ago
ewuhic|1 year ago
fbnt|1 year ago
cyanydeez|1 year ago
ranger_danger|1 year ago
pcarion|1 year ago
https://github.com/jianchang512/pyvideotrans/blob/main/docs/...
unknown|1 year ago
[deleted]
microflash|1 year ago