top | item 36758390

(no title)

tiluha | 2 years ago

The trained model would likely be able understand both Alienese and English equally well, but it never learned to translate even one word or context. It might have an internal representation for "eating food" in both languages, but since since no links exists between the languages the embeddings will not be close.

You could try it on earth with if you train a model on two separate languages, being careful that the traning data does not contain any mixed language. But even then, modern Human languages most likely have too much cross-contamination. Would be an interesting experiment nevertheless

discuss

cubefox|2 years ago

That's the question, would the embeddings be close?

It's not clear that they wouldn't. Would an embedding of the Alienese word for "and" be close to the embedding of the English "and"? This does seem quite possible to me.

> You could try it on earth with if you train a model on two separate languages, being careful that the traning data does not contain any mixed language. But even then, modern Human languages most likely have too much cross-contamination. Would be an interesting experiment nevertheless

I agree. Though shouldn't we be able to answer this a priori? It sounds like a mathematical question.