(no title)
agarsev | 2 years ago
In any case, text still seems to form a part:
> During training, we use monolingual speech-text datasets
So there's still some way to go before machines learn language as humans do, i.e. with sound as the primary modality. But nowadays I won't bet on how long any ML task for language will take to be solved.
agarsev | 2 years ago
1) they use a single, shared embedding space for the two languages, forcing the model to learn "semantics" independently (or rather, interdependently) of language; 2) they use back-translation for training. I'm not sure I got this right, but this seems to be round-trip translation? So the model can self-assess its performance by checking the Spanish->English->Spanish difference.
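If that reading is right, the self-assessment idea can be sketched with toy lookup tables standing in for the two translation directions (everything below is a hypothetical illustration, not the paper's actual model):

```python
# Toy sketch of round-trip (back-translation) self-assessment.
# ES_EN / EN_ES are stand-ins for the model's es->en and en->es
# directions; real systems would use the learned model itself.

ES_EN = {"gato": "cat", "perro": "dog", "casa": "house"}
EN_ES = {v: k for k, v in ES_EN.items()}

def translate(sentence, table):
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(table.get(w, w) for w in sentence.split())

def round_trip_score(sentence_es):
    """Fraction of words preserved after an es->en->es round trip.

    A lossy round trip (low score) signals translation errors,
    giving the model a training signal without parallel data.
    """
    en = translate(sentence_es, ES_EN)
    es_back = translate(en, EN_ES)
    orig, back = sentence_es.split(), es_back.split()
    matches = sum(a == b for a, b in zip(orig, back))
    return matches / max(len(orig), 1)

print(round_trip_score("gato perro casa"))  # 1.0: fully preserved
```

The appeal is that the round-trip difference can be computed from monolingual data alone, which is presumably why back-translation is useful when parallel speech corpora are scarce.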
Sounds very promising and interesting! However, it seems they only tested on Spanish and English. I wonder if the lexical similarity of the two languages made these results possible.
IanCal | 2 years ago