divingdragon | 4 years ago

> This was a surprising one to me. Simplified Chinese was expectedly more efficient than Traditional Chinese, but both were beaten out by Cantonese (which also uses traditional characters).

My first reaction to this was: "What? Cantonese?" I just knew something was wrong.

So I checked the data and immediately saw the issue. `yue_Hant_HK` ("Cantonese"), `zh_Hant_HK` (Chinese, Hong Kong) and `zh_Hant_MO` (Chinese, Macau) all use the same text, while `zh_Hant` (Traditional Chinese) and `zh_Hant_TW` (Chinese, Taiwan) use a different text.
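A rough sketch of that kind of check, assuming the translations are already loaded into a dict keyed by locale code. The function name `group_locales_by_text` and the placeholder strings are hypothetical; the real data and its file layout are not shown here.

```python
from collections import defaultdict

def group_locales_by_text(texts):
    """Group locale codes whose translation text is exactly identical."""
    groups = defaultdict(list)
    for locale, text in texts.items():
        groups[text].append(locale)
    # Sort inner and outer lists so the result is stable across runs.
    return sorted(sorted(locales) for locales in groups.values())

# Placeholder strings stand in for the actual translations (assumption).
sample = {
    "yue_Hant_HK": "text A",
    "zh_Hant_HK": "text A",
    "zh_Hant_MO": "text A",
    "zh_Hant": "text B",
    "zh_Hant_TW": "text B",
}

for group in group_locales_by_text(sample):
    print(group)
```

Locales that end up in the same group share one translation, so any "efficiency" difference between them cannot come from the language itself.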

As it turns out, there is no Cantonese text here, just plain written Chinese (書面語, "written language", as we would call it). Both Hong Kong and Taiwan use the Traditional Chinese script, but the two don't use exactly the same vocabulary due to regional differences. For example, the term "privacy" is "私隱" in Hong Kong, but "隱私" in Taiwan. This is why there exist two Traditional Chinese translations of the same text. Since they were likely done by different people (assuming they are not machine translations), they have different translation styles, which contributes to the difference in length between the two paragraphs.

(Also, Cantonese can be written in Simplified Chinese, but that's all I will say regarding this topic.)

princeb | 4 years ago

the writing of cantonese is a complicated affair.

the only analogy i can cook up is that - imagine if formally, everyone wrote English in German, but when you speak you speak English, and when you read English you'd see German words and sentence structures but you will preprocess it into English before understanding it. and with different levels of formality, would You with underschiedenly Germandegrees speak English. but in no case would you speak German.