

panpog | 7 months ago

Can you fit everything into 32 bits? I have no idea, but Hangul and Indic scripts seem like they might have a combinatoric explosion of infrequently used characters.



eviks|7 months ago

But they don't have that explosion if you only encode the combinatoric primitives those characters are made of and then use composing rules?
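For what it's worth, here is a rough sketch of what "primitives plus composing rules" looks like for modern Hangul as Unicode encodes it today. It relies on Python's standard `unicodedata.normalize`, and the helper `compose_syllable` is a hypothetical name introduced only for this illustration. It assumes the standard arithmetic mapping of 19 leading consonants, 21 vowels, and 28 trailing slots (including "no trailing consonant") onto the precomposed block starting at U+AC00.

```python
import unicodedata

# Three conjoining jamo: leading HIEUH, vowel A, trailing NIEUN.
jamo_sequence = "\u1112\u1161\u11AB"

# NFC normalization applies Unicode's composing rules to the sequence.
composed = unicodedata.normalize("NFC", jamo_sequence)

# Counts used by the arithmetic mapping for modern Hangul syllables.
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_BASE = 0xAC00


def compose_syllable(l_index: int, v_index: int, t_index: int) -> str:
    """Hypothetical helper: map jamo indices to a precomposed syllable.

    t_index 0 means "no trailing consonant".
    """
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


# Indices are offsets from U+1100 (leading), U+1161 (vowel), U+11A7 (trailing).
by_formula = compose_syllable(0x1112 - 0x1100, 0x1161 - 0x1161, 0x11AB - 0x11A7)

total_modern_syllables = L_COUNT * V_COUNT * T_COUNT

print(composed, by_formula, total_modern_syllables)
```

Under these assumptions both routes give the same single codepoint, and the whole modern syllable space is 19 × 21 × 28 = 11172 blocks, built from a few dozen primitives.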

panpog|7 months ago

You still get the combinatoric explosion, but you have more bits to work with. Imagine if you could combine any 9 jamo into a single Hangul syllable block. (The real combinatorics is more complicated, and I don't know if it's this bad.) Encoding just the 24 jamo and a control character requires 25 codepoints. Giving each syllable block its own codepoint would require 24^9 > 2^32 codepoints.
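A quick check of the arithmetic in that thought experiment, as a minimal sketch. The numbers 24 and 9 are the hypothetical figures from the comment, not the real structure of Hangul, and the variable names are made up for illustration.

```python
# Hypothetical model from the comment: any 9 of 24 jamo per syllable block.
JAMO = 24
SLOTS = 9

# One codepoint per possible block (the precomposed approach).
precomposed_codepoints = JAMO ** SLOTS

# One codepoint per jamo plus one control character (the primitives approach).
primitive_codepoints = JAMO + 1

# Does the precomposed approach fit in a 32-bit code space?
fits_in_32_bits = precomposed_codepoints <= 2 ** 32

print(precomposed_codepoints, primitive_codepoints, fits_in_32_bits)
```

In this model the precomposed count comes to 2,641,807,540,224 against 4,294,967,296 values for 32 bits, so it overshoots by a factor of roughly 600, while the primitives approach needs only 25 codepoints.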