> The problem stems from the fact that Unicode encodes characters rather than "glyphs," which are the visual representations of the characters. There are four basic traditions for East Asian character shapes: traditional Chinese, simplified Chinese, Japanese, and Korean. While the Han root character may be the same for CJK languages, the glyphs in common use for the same characters may not be. For example, the traditional Chinese glyph for "grass" uses four strokes for the "grass" radical [⺿], whereas the simplified Chinese, Japanese, and Korean glyphs [⺾] use three. But there is only one Unicode point for the grass character (U+8349) [草] regardless of writing system. Another example is the ideograph for "one," which is different in Chinese, Japanese, and Korean. Many people think that the three versions should be encoded differently.

https://en.m.wikipedia.org/wiki/Han_unification

Seems like Wikipedia has a good overview of the issue.
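
To make the "only one code point" part of the quote concrete, here is a minimal Python sketch using only the standard library. The helper name `describe` is made up for illustration. It only shows that the grass character is a single code point with a single byte encoding, so nothing in the plain text itself records whether a Chinese, Japanese, or Korean glyph is intended; how a font or renderer then picks a glyph is outside what this snippet demonstrates.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


# The grass character from the quote, written as an escape so the
# example does not depend on the file's own encoding.
grass = "\u8349"

# One string element, one code point, whatever language the text is in.
print(len(grass))
print(describe(grass))

# The UTF-8 bytes are likewise the same for every writing system,
# so the bytes alone cannot say which regional glyph shape is wanted.
print(grass.encode("utf-8").hex(" "))
```

Any language preference has to travel outside the character data, for example as markup or font selection, which is the gap the quoted passage is describing.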
