top | item 14060171

Bad Character: If Chinese Were Phonetic

81 points| bemmu | 9 years ago |newyorker.com

100 comments

[+] sparky_z|9 years ago|reply
Is it just me, or does this essay seem to be missing an introduction, like someone accidentally deleted the first few paragraphs? I initially assumed I had been linked to "page 2" and went looking for a link back to the beginning.

Edit: Apparently, this is part of a series where guest authors are invited to choose something to "uninvent" and explain why they think it has had a negative impact on the world. The essay's abrupt beginning makes much more sense in that context.

[+] truthexposer|9 years ago|reply
I've seen this sentiment a lot in Chinese-Americans that are not educated in linguistics, along with other self-loathing sentiments.

First, literacy isn't completely related to the writing system. Look at Spanish speaking countries, where the alphabet is more phonetic than the English alphabet.

Second, Chinese characters that are more complex, i.e. consist of more than one radical, are usually composed of a semantic component, which gives an indication of the character's meaning, and a phonetic component, which gives an indication of the sound of the character. Although this isn't a rule, it helps a lot, and it's not like English doesn't have crazy non-phonetic spellings as well (how tf is "through" supposed to be pronounced for an English learner?)

Last, the Chinese language consists of MANY homophones. This isn't necessarily a bad thing, and not one out of design, but something that is the result of being one of the oldest language families in the world. It allows for the concise expression of many things using only single syllables. You might say, but what about the crazy amount of ambiguity if the language has a lot of homophones? Well, ambiguity is a huge problem in all languages and our brains seem to manage. Now, even though homophones aren't a big problem in spoken language, because of intonation and prosody giving a clue to how to analyze sentences, written language is a different story, and it would be very hard to make an easier system to handle it. For all you engineers, the fact that Chinese has characters is essentially a performance trade off. More information density for more ambiguity.

[+] Banthum|9 years ago|reply
It's actually a myth that most Chinese characters have a semantic component (indicating meaning).

And the phonetic component often doesn't correspond to anything any modern person would know.

The problem in both cases is the shift of language through China's long history, and its divergence from the original design of the written characters.

The problem with the semantic components is that the meaning is re-used and stretched over and over. E.g. they used to use a 'foot' radical to represent a journey towards a destination. Later that shifted to mean "in a straight line, not veering left or right". Later that took on the meaning of "straight and narrow" or "straight shooter" or "not-deviant". Then it becomes, "not deviating from the right path". So now, the old foot radical typically means "justice" or "correct".

And the image shifts over time. This is the old foot radical now: 正 Does it look like a foot to you?

The problem with the phonetics is language shift. In many cases, in ancient Chinese, the characters do have a phonetic component that hints at pronunciation. But the pronunciation changed over the last 2000-3000 years, so the pronunciation hint that made perfect sense in the Han dynasty is now meaningless because you're speaking a different language.

The result is that in the end the characters end up being arbitrary phonetic symbols with some arbitrary meanings attached.

[+] Analemma_|9 years ago|reply
Historically, there have been repeated attempts in both China and Japan to replace Chinese characters with a phonetic system (and Korea and Vietnam actually went through with it). These efforts haven't panned out, but it's demonstrably not the case that only ignorant émigrés think a phonetic system would be better.
[+] yongjik|9 years ago|reply
> ... something that is the result of being one of the oldest language families in the world.

That makes no sense. If we had a time machine, we'd be able to trace every modern language back to some African tribe that became the ancestor of all modern humans.

Perhaps you meant having a literary tradition that's among the oldest in the world, but even that doesn't necessarily explain why there would be so many homophones. Languages lose words all the time, even ones with great literary traditions.

[+] DonaldFisk|9 years ago|reply
> it's not like English doesn't have crazy non-phonetic spellings as well (how tf is "through" supposed to be pronounced for a English learner?)

English is probably the worst case among alphabetic languages, but you can still usually guess the exact pronunciation of an unfamiliar word.

> the Chinese language consists of MANY homophones.

But still Chinese speakers make themselves understood without difficulty, even over the telephone. Most Chinese words are polysyllabic. Taking that into account, homophones are rarer than you might think.

There's a language+, Dungan (https://en.wikipedia.org/wiki/Dungan_language), which is to some extent mutually intelligible with Mandarin. It's written entirely in the Cyrillic alphabet without diacritics to indicate tone. Sample text: http://www.omniglot.com/babel/dungan.htm

+ You could call it a dialect, but it has its own script.

[+] thedz|9 years ago|reply
> I've seen this sentiment a lot in Chinese-Americans that are not educated in linguistics, along with other self-loathing sentiments.

FWIW, I think Ted Chiang (author of Story of Your Life, which Arrival was based on) knows a decent amount about linguistics.

At this point, Chinese characters aren't going to go anywhere. But I think it's an interesting thought experiment to puzzle through.

[+] contravert|9 years ago|reply
I agree with you. The author even admits right off the bat that he failed to learn the language. I don't understand the mindset of someone who would write an article critiquing Chinese characters, knowing so little about them.

If the author can be taken this seriously given his self-professed ignorance, I am probably much more qualified than him to speak about Chinese characters.

There are two points worth disambiguating here. One is whether Chinese characters hinder the literacy of native speakers and the other is whether they hinder it for learners. Neither I nor the author has any authority to speak about the former.

As a heritage speaker, I actually only recently learned Chinese to a level where I consider myself literate. My experience was that the characters were not an obstacle, like the author suggests, but an indispensable tool for rapidly learning the language.

To learn any language, it is unavoidable that you need to memorize thousands of new words. Memorizing a Chinese character is not much harder than memorizing a word. However, the magic of Chinese characters comes when you combine them to form actual words.

The vast majority of Chinese words consist of 2 characters, but because each character also encodes meaning, you can more often than not guess the meaning of a word you have never seen before.

Although you can take advantage of common roots for words in other languages, the scale is simply incomparable. For other languages, you pretty much have to memorize every new word you see.

My experience has been that while learning Chinese characters may be a higher upfront cost (although I disagree), once you learn enough, you rapidly understand way more vocabulary because characters themselves encode semantic information.

Perhaps it's simply the lack of any cognates between English and Chinese, and a tinge of cultural supremacy, that make people, like the author, think it's Chinese characters that are the root of everything that makes the language difficult to learn.

[+] interfixus|9 years ago|reply
> ... a lot in Chinese-Americans that are not educated in linguistics, along with other self-loathing sentiments

The world's languages, of course, developed nicely without intervention from linguistics, a latter-day mostly retrospective discipline.

[+] T-R|9 years ago|reply
> First, literacy isn't completely related to the writing system. Look at Spanish speaking countries, where the alphabet is more phonetic than the English alphabet.

Perhaps more to the point, Japan, which uses Chinese characters for nouns, adjectives, adverbs, and verb-roots, has a very high literacy rate. [1]*

[1] https://nces.ed.gov/pubs2014/2014008.pdf

* Not trying to cherry-pick for Japan being at the top, but Japan's not listed in the 2015 UNESCO report that's referenced all over the place. This US DoE report from 2013 seems legitimate enough.

[+] bluGill|9 years ago|reply
English needs to reform its written language as well. It isn't as bad as Chinese, but there is a reason "only two languages commonly have writing competitions: Chinese and English." (I'm not sure who to attribute this to; it isn't mine, though I probably got it wrong.)

Spanish adults read at a 5th grade level, English adults read at a 6th grade level, Japanese adults read at a 9th grade level. This isn't a reflection of education; it is a reflection of how difficult the written language is to learn.

As a bad speller with a passing knowledge of Spanish I'm jealous: when I hear a Spanish word I can spell it, with English I have no clue.

[+] malandrew|9 years ago|reply
How does Chinese handle the propagation of neologisms?

A rough 1:1 phonetic correspondence between oral and written languages intuitively (to me at least) would be more fit for the creation and propagation of new words, or am I missing some other natural form of conveyance and propagation that exists in the Chinese writing system?

[+] lstyls|9 years ago|reply
Color me shocked that somehow the author found the alphabet of his native language to be superior.

This article takes as a given some assumptions that I don't understand at all. China being resistant to change? There are few cohesive societies that I can think of that have experienced more change than China since the end of WWII. And it would be an understatement to say that the rise in literacy rates is more strongly correlated with industrialization than with the adoption of phonetic alphabets.

I think it's telling that the author grew up in a Chinese-American community. Expat communities tend to lag their mother cultures in terms of social progress. The experience of growing up in such a community could explain the "steeped in tradition" characteristic that is attributed to Chinese culture here.

Disclaimer: I'm a white American. My wife is Chinese and emigrated herself from China as a young adult; I get a lot of my perspective from her. Would be interested to hear what Chinese members of HN think of the article.

[+] jbooth|9 years ago|reply
As far as "change" and "cohesive society", it's possible that the only reason that the Chinese have a 2500-year-old cohesive society is because of initial resistance to change. How else can you have one "society" across a dozen dynasties, from the ancient world through the medieval world to the modern one?

It's a lot harder to claim that some old druids or even the anglo-saxons are the same society as modern british-descended english speakers.

[+] hsitz|9 years ago|reply
". . . somehow the author found the alphabet of his native language to be superior."

What? First of all, it doesn't appear that Chinese was the author's native language, given that he was "forced to attend Saturday morning Chinese school" as a child, and did poorly. He was born in New York state and currently lives near Seattle.

Second, and more importantly, Chiang lists several advantages of phonetic systems over purely character-based systems, which hold equally whether we're talking about Chinese or any other written language. So even if Chinese were his native language, the differences he notes between phonetic and character-based writing systems are general differences that would exist in any language that had both systems of writing.

It's not clear to me, however, that he thinks the differences make phonetic systems "superior". The character-based system has its own advantages it gains from not being tied to pronunciation, e.g., easy readability of classic texts. And Chiang expressly says that "he has no idea if he would be better off in a world" where there had never been a Chinese character-based writing system.

Regarding the rate of change in China, you say that there has been rapid change "since World War II". It seems to me that this rapid change has been possible precisely because China was so backward before then, resistant to change for centuries and millennia before WW II. I suspect -- but have no idea myself -- that the recent rapid change has been enabled in large part by heavier reliance on a phonetic writing system, which as Chiang correctly points out is much more suitable for use with computers.

[+] phaed|9 years ago|reply

[deleted]

[+] thedz|9 years ago|reply
For context, to forestall some knee jerk reactions I'm seeing:

1. This essay is part of a series, where guest authors are explicitly asked to "uninvent" something. So there's a level of built-in hot take to this.

2. The author is Ted Chiang, writer of Story of Your Life (and the work that the movie Arrival was based on). He's explored how language affects how we think before (Story of Your Life/Arrival is explicitly about this). So this kind of falls in line with that.

Anyway, I think this is a fascinating thought experiment to work through. What would China be like with a phonetic system? What would change? How much of the culture is derived from the method of language and how much from other factors?

[+] Nadya|9 years ago|reply
>I would never have to read or hear any more popular misconceptions about Chinese characters—that they’re like little pictures, that they represent ideas directly, that the Chinese word for “crisis” is “danger” plus “opportunity.” That, at least, would be a relief.

But... they are? They're ideograms and there is at least some reasoning to many of them. Things written with a part of "fire" tend to have a relation to..well...fire and heat.

火, 烧 炊. You might see the relation 秋 shares with fire. But don't fall into the trap that it is 100% consistent because you'd be wrong about 約.

And the last statement isn't even (technically) wrong. I'm not sure if the Chinese word is the same [0] but for Japanese it is 危機. With individual readings of "danger" and "opportunity" although a more reasonable 1-to-1 equation would be "danger" and "occasion". A crisis is a dangerous occasion. 機 happens to have several meanings and one of those is "opportunity". It is probably intentionally misleading but not incorrect to say it is danger + occasion.

You can't have your cake and eat it too. For most compound words the individual meanings of the hanzi/kanji are very relevant - maybe not always relevant but more often than not.

[0] Google Translate tells me it is, so my confidence is >0% but not by much.

[+] gizmo686|9 years ago|reply
Most of the time (In Japanese, at least), the Kanji do not map to meaning per-se, but to roots. For example, in English, we have words like "lap-top", "uni-cycle", "roof-top", "blue-berry", that clearly have multiple semantic components within them. In these cases, in Japanese, the words would be written with the kanji for their roots. In more linguistic terms, the kanjis often refer to morphemes, not ideas.

There are some exceptions. For example, in Japanese, today (kyou) is written as "今日", even though it is a single morpheme. In cases such as this the kanji are used for their meanings. Here it is not clear how much the kanji actually help, because the semantic information provided is precisely the information that you would get from knowing the roots, and you (or rather a native speaker) do not have to be taught roots (or at least those roots that still have semantic meaning).

[+] mdturnerphys|9 years ago|reply
For those unaware, Ted Chiang wrote the short story that the 2016 film Arrival was based on.
[+] gumby|9 years ago|reply
>Imagine a world in which written English had changed so little that works of “Beowulf”’s era remained continuously readable for the past twelve hundred years.

This works with an alphabetic system: in Iceland kids read the sagas in school.

[+] waqf|9 years ago|reply
And in fact, written English has changed very little in the past five hundred years or so: it's easy to read Shakespeare and not too hard to read Chaucer, although their pronunciation would have been considerably different. Before that period there was a much greater rate of change in the written language.
[+] failrate|9 years ago|reply
It is the intense educational requirements of the Chinese character system that led to the development of Hangul, the Korean character system. You still end up with a similar block structure, but it is phonetic and has about the same number of components as there are letters in the English alphabet.
[+] smilekzs|9 years ago|reply
That might be true, but a (perhaps more) important reason is literally the first 8 (Chinese) characters in Hunminjeongeum [1], translated as follows:

> Because the speech of this country is different from that of China, it [the spoken language] does not match the [Chinese] letters.

[1]: https://en.wikipedia.org/wiki/Hunminjeongeum

[+] pqhwan|9 years ago|reply
It may be misleading to call it a "similar block structure". The structure may have been inspired by the way Chinese characters are composed to form new ones, but it's really just a way of chunking phonetic letters into syllabic units.
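
To make the "chunking" point concrete, here is a minimal sketch of the arithmetic Unicode uses for precomposed Hangul syllable blocks: each block is just an (initial, vowel, final) triple packed into one code point. The function and constant names are my own for illustration; the numbers (base U+AC00, 19 initials, 21 vowels, 28 final slots including "no final") come from the Unicode Standard's Hangul syllable composition rules.

```python
# Hangul syllable blocks as pure arithmetic over three letter indices.
# Names here (compose, decompose, S_BASE, ...) are illustrative, not a library API.
S_BASE = 0xAC00   # code point of the first precomposed syllable
L_COUNT = 19      # initial consonants
V_COUNT = 21      # vowels
T_COUNT = 28      # final consonants, where index 0 means "no final"


def compose(l, v, t=0):
    """Pack initial index l, vowel index v, final index t into one syllable block."""
    if not (0 <= l < L_COUNT and 0 <= v < V_COUNT and 0 <= t < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (l * V_COUNT + v) * T_COUNT + t)


def decompose(syllable):
    """Recover the (initial, vowel, final) indices from a syllable block."""
    offset = ord(syllable) - S_BASE
    if not 0 <= offset < L_COUNT * V_COUNT * T_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    l, rest = divmod(offset, V_COUNT * T_COUNT)
    v, t = divmod(rest, T_COUNT)
    return l, v, t


if __name__ == "__main__":
    # Initial index 18 + vowel index 0 + final index 4 gives U+D55C ("han").
    print(compose(18, 0, 4), decompose(compose(18, 0, 4)))
```

The block looks like a character, but nothing in it has to be memorized beyond the letters themselves: 19 × 21 × 28 = 11,172 possible blocks fall out of a few dozen components.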
[+] kmicklas|9 years ago|reply
Some linguists (mostly Western) have advocated transitioning to a mixed system of characters and pinyin. I don't think that will ever happen because an uglier and less harmonious writing system could not possibly be devised. Chinese would be much better off with something syllabic and square, like Hangul. With some amount of fiddling I think it could be made to work for most of the different Chinese languages and potentially even elucidate cognates between them, retaining one of the main benefits of characters.
[+] yongjik|9 years ago|reply
FWIW, I like Jared Diamond's thesis better, which is that the geography of China (a big habitable landmass with no sizable peninsulas or isolated areas) made a single political entity inevitable. It probably explains why the Chinese culture has more emphasis on tradition (if that is true, I mean): you can more easily identify yourself with your ancestors from 500 BC, because they are all Chinese. A Spaniard would have a harder time identifying with Celts, Romans, or Moors.
[+] divbit|9 years ago|reply
I think in a software startup mindset it could be easy to think that more efficient => better, but a point not mentioned in the essay is that written Chinese is simply beautiful compared to e.g., the simple 26 character alphabet, which is quicker to learn. This also applies to some of the other alphabets such as Hindi, Arabic, etc. But Chinese has so many characters, it really shines.
[+] wsxcde|9 years ago|reply
As someone who knows a bunch of different scripts (Perso-Arabic/Urdu, Brahmi-based scripts like Devanagari, Kannada, Tamil, Bengali and obviously the Roman script), scripts do make a difference.

Yes, a lousy script isn't a fatal impediment. You can produce beautiful literature even with a poorly-designed script. And I suspect script complexity is only weakly correlated to literacy. Learning to read involves a lot more than recognizing characters and words. But a bad script definitely adds a layer of confusion that is completely unnecessary and ends up creating an elitist and cumbersome language.

The standard counterpoint to complaints about scripts is some form of whataboutism involving English. Yes, English is not remotely phonetic, but the modern form of the Roman script for English is actually pretty decent. It has a lot of visual differentiation between different letters. Compare this to Persian and Arabic where the placement of a dot or three makes all the difference between wildly different sounds.

The Roman script also has relatively few letters. This comes at the cost of ambiguity, and we need to make up sequences for certain sounds (th, sh, ch), but also means the learning curve is a lot less steep. Persian and Arabic have the huge mess involving initial, medial, final and standalone forms of a letter -- it serves no purpose really. Brahmi-based scripts are all abugidas which means vowels are folded into adjoining consonants and so you need to learn about 4X as many symbols as English.

You do have to guess what a letter sounds like in English (g,j,k,c). But this is also pretty common. Tamil doesn't distinguish between voiced and unvoiced consonants. This is an unnecessary annoyance. A much worse example is of Persian, Arabic and Urdu where the script allows one to drop a lot of vowels. This is especially bad for Urdu. For example, the Hindustani words for here (idhar) and there (udhar) are written identically in Urdu, so you need to "backtrack" to fill-in vowels appropriately based on the rest of the sentence.

Yes, English isn't phonetic, and is unstructured and unsystematic. But it is still a pretty decent script compared to some of the others out there. From an Indian perspective, I wish we could move to one script -- perhaps some variant of Devanagari for all our languages. There's really little purpose in differentiating between Gurmukhi and Devanagari. And we should just forget about Urdu altogether. It made sense when the goal was to teach people one script that gave them access to both Hindustani and the official court languages of Persian/Arabic. But for today, Devanagari is vastly superior for writing any dialect of Hindustani. Similarly, Kannada has an almost bijective mapping to Devanagari, but each symbol in Kannada is just so much more elaborate and a pain to write. Life would be much better with one script.

Bring up changing scripts though and people -- and this appears to be a global phenomenon -- just go nuts! People need to realize that changing scripts isn't a terrible thing and has occurred many times for many languages. We're not going to forget our languages/culture just because we choose to change the symbols we use in writing.

[+] xenadu02|9 years ago|reply
Perhaps someone with a linguistics background can chime in, but my (layperson) understanding is that many writing systems start out pictographic and evolve over time, becoming syllabaries, then eventually alphabets. The Chinese writing system represents an incomplete transition. Egyptian represents a mixed but mostly complete transition, as by the late third period their writing system was mostly based on the sounds of the symbols, not the literal concepts they originally meant (e.g.: using a snake symbol for the "s" sound). Middle period writings are often a mix of all three: some symbols used for their literal meaning, some symbols used to represent syllables, and some for their sound. Sometimes (but not always) there were markings next to the symbols to indicate what "mode" you should read them in.

All YMMV of course, I'm not an expert. I try to avoid being too euro-centric as it is easy for me to claim alphabets are superior because that's what I know, but it does seem like memorizing abstract rules and a small set of letters is easier than memorizing thousands of characters. It is certainly easier to deal with by machine (whether that be a typewriter, a terminal, or designing a modern OpenType font).

[+] warlox|9 years ago|reply
Latin script is only slightly worse than Hangul (which eliminates syllabic ambiguity), but English is a horror. In sane languages using Latin script, new letters are invented or diacritics are used to represent sounds not used in Latin.
[+] warlox|9 years ago|reply
> smartphones are impossible to use if you’re restricted to Chinese characters

This isn't actually true. Handwriting recognition is much more popular than any other means of text input in Hong Kong.