top | item 39813731

patal | 1 year ago

1. "graphic representation of writing systems" and "text" mean the same thing to me. Do you mean text as spoken?

2. I think pronunciation should not be encoded into the text representation in general. You would need different encodings for "though" and "through" in English alone. Your example leaves the meaning open, even when read as text. If I were the editor, and the distinction was important, I'd change it to "For example, the Cyrillic letter 'c'".

I understand that Unicode provides different code points for same-looking characters, mostly for historical reasons: these characters came from different code pages in language-specific encodings.
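To make the "different code points for same-looking characters" point concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `describe` is made up for illustration; the two characters are the Latin "c" and the Cyrillic "с" discussed in this thread.

```python
import unicodedata

latin_c = "\u0063"      # Latin small letter c
cyrillic_es = "\u0441"  # Cyrillic small letter es, looks like "c" in most fonts


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(latin_c))       # U+0063 LATIN SMALL LETTER C
print(describe(cyrillic_es))   # U+0441 CYRILLIC SMALL LETTER ES

# The two strings compare unequal even though they may render identically.
print(latin_c == cyrillic_es)  # False

# Compatibility normalization does not merge them either.
print(unicodedata.normalize("NFKC", cyrillic_es) == latin_c)  # False
```

So a plain string comparison, and even NFKC normalization, treats them as distinct characters regardless of how a font draws them.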

pavel_lishin|1 year ago

I mean text as in the platonic ideal of "c" and "с". Just because they look the same does not make them the same character. If we're going to encode characters as a single code point because they happen to have pixel-identical renderings in certain fonts, the next logical step is to encode identical letters that look different in different fonts or writing styles as separate code points as well - for example, the English letter "g" is a fucking orthographic nightmare.

kps|1 year ago

Imagine if, say, English people normally wrote an open ‘g’ and French normally wrote a looped ‘g’, and you have the essence of the Han Unification debates.