top | item 22894642

DougGwyn | 5 years ago

I thought it was strictly one character per 32-bit code. Anyway, whatever it is called, it is what wchar_t should be.

a1369209993|5 years ago

There are no fixed-width encodings with a range of encodable characters anywhere near that of Unicode.
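To make the distinction concrete, here is a minimal Python sketch (not from the thread). It assumes "character" means what a reader sees rather than a single code point. The helper name `utf32_units` is made up for illustration. UTF-32 spends exactly one 32-bit unit per code point, yet the same visible "é" can take one unit or two depending on whether it is precomposed.

```python
import unicodedata

def utf32_units(text: str) -> int:
    """Count the 32-bit code units needed to encode text as UTF-32 (no BOM)."""
    return len(text.encode("utf-32-le")) // 4

precomposed = "\u00e9"   # é as a single code point (U+00E9)
decomposed = "e\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

print(utf32_units(precomposed))   # 1
print(utf32_units(decomposed))    # 2

# Both spellings are canonically equivalent: NFC folds the pair into U+00E9.
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True
```

So a 32-bit code is fixed width per code point, but not per user-perceived character.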

flatfinger|5 years ago

It's too bad Unicode wasn't designed around the concept of easily recognizable grapheme clusters and "write-only" [non-round-trip] forms that are normalized in various ways. A text layout engine shouldn't need detailed knowledge of rules that are constantly subject to change. If there were a standard representation for a Unicode string in which all grapheme clusters are marked and everything is listed in left-to-right order, and an OS function were available to convert a Unicode string into that form, a text layout engine using that OS routine could accommodate future additions to the character set and glyph-joining rules without having to know anything about them.
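As a rough illustration of what such a pre-segmented form might look like to its consumer, here is a small Python sketch. It is a deliberately crude approximation, not the actual Unicode segmentation rules: it only attaches combining marks to the preceding base character, and it ignores ZWJ sequences, Hangul, regional indicators, and bidirectional reordering. The function name `mark_clusters` is hypothetical.

```python
import unicodedata
from typing import List

def mark_clusters(text: str) -> List[str]:
    """Split text into approximate clusters: a base character plus any
    combining marks that follow it. A crude stand-in for real grapheme
    cluster segmentation, for illustration only."""
    clusters: List[str] = []
    for ch in text:
        # A non-zero canonical combining class means ch is a combining mark.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # attach the mark to the previous cluster
        else:
            clusters.append(ch)     # start a new cluster
    return clusters

# 'e' + combining acute accent stays together; 'a' is its own cluster.
print(len(mark_clusters("e\u0301a")))   # 2
```

A layout engine that received the list directly would not need to know why the boundaries fall where they do, which is the point of the proposal.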