(no title)
supergarfield|4 years ago
Going from spelling to pronunciation in French follows (admittedly complex) rules that are rarely broken except for common words (or endings such as -ent). Vowel pronunciations for a given spelling are far more variable in English, and often depend on the etymology of the word. Plus, English has word-level stress that is not marked in writing (French has none, and it's marked in Spanish), and moving the stress will usually make a word unintelligible! That alone makes writing => pronunciation very difficult.
gecko|4 years ago
Unsurprisingly, we can vaguely quantify this by looking at dyslexia rates across languages. English and the East Asian languages that rely on Chinese characters are by far the worst, followed by languages like Arabic, French, Hebrew, and German, which have fewer exceptions but give less guidance, and finally by languages like Spanish, Cherokee, and so on, whose writing systems are truly one-to-one.
eindiran|4 years ago
There are a number of ways currently used, but I have a new one to propose: compare the sizes of two G2P (grapheme-to-phoneme) models, one for each language, that have similar RMS errors. Assuming they are built using similar techniques, the language that needs the bigger model probably has a less clean grapheme-to-phoneme correspondence.
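To make the proposal concrete, here is a rough sketch of the idea, not a real G2P system. It makes several assumptions that are not in the comment above: the model is a plain lookup table from a letter plus `k` letters of context to a phoneme, the error is a per-letter error rate on the training lexicon rather than an RMS error, and the two tiny lexicons (`REGULAR`, `IRREGULAR`) are made up, with one phoneme symbol per letter so no alignment step is needed. All function names are hypothetical.

```python
from collections import Counter, defaultdict

# Made-up toy lexicons: (spelling, phonemes), one phoneme symbol per letter.
REGULAR = [("pata", "pata"), ("tapa", "tapa"), ("sopa", "sopa")]
IRREGULAR = [("cat", "kat"), ("cent", "sent"), ("cot", "kot"), ("city", "siti")]

def contexts(word, k):
    """Yield each letter of `word` with k letters of context on either side."""
    padded = "#" * k + word + "#" * k
    for i in range(len(word)):
        yield padded[i:i + 2 * k + 1]

def fit(lexicon, k):
    """Build a table mapping each context window to its most frequent phoneme."""
    counts = defaultdict(Counter)
    for spelling, phonemes in lexicon:
        for ctx, ph in zip(contexts(spelling, k), phonemes):
            counts[ctx][ph] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}

def error_rate(model, lexicon, k):
    """Fraction of letters whose predicted phoneme is wrong."""
    wrong = total = 0
    for spelling, phonemes in lexicon:
        for ctx, ph in zip(contexts(spelling, k), phonemes):
            total += 1
            wrong += model.get(ctx) != ph
    return wrong / total

def smallest_model_size(lexicon, target_error=0.0, max_k=3):
    """Grow the context window until the error target is met.

    Returns the number of table entries in the first model that is good
    enough, or None if no window up to max_k reaches the target.
    """
    for k in range(max_k + 1):
        model = fit(lexicon, k)
        if error_rate(model, lexicon, k) <= target_error:
            return len(model)
    return None

if __name__ == "__main__":
    print("regular:", smallest_model_size(REGULAR))
    print("irregular:", smallest_model_size(IRREGULAR))
```

On these toy lexicons the irregular one needs a wider context window, and therefore a larger table, to reach the same error, because the pronunciation of "c" depends on the following letter. A real comparison would need held-out data, proper grapheme-phoneme alignment, and the same model family and error target for both languages.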
jgwil2|4 years ago