ubernostrum | 7 years ago
Also, if you think you can decompose without allocating memory... well, try a code point like U+FDFA.
For reference, its compatibility (NFKD) decomposition is 18 code points long:
U+0635 U+0644 U+0649 U+0020 U+0627 U+0644 U+0644 U+0647 U+0020 U+0639 U+0644 U+064A U+0647 U+0020 U+0648 U+0633 U+0644 U+0645
(and that doesn't begin to touch any of the potential issues with variant forms, homoglyph attacks, etc.)
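For anyone who wants to check the expansion themselves, here is a minimal sketch using Python's standard `unicodedata` module. It assumes the relevant form is compatibility decomposition (NFKD); canonical decomposition (NFD) leaves U+FDFA untouched, because its mapping is a compatibility one.

```python
import unicodedata

ligature = "\uFDFA"  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM

# Canonical decomposition does not expand it; compatibility decomposition does.
nfd = unicodedata.normalize("NFD", ligature)
nfkd = unicodedata.normalize("NFKD", ligature)

print(len(nfd))   # 1
print(len(nfkd))  # 18
print(" ".join(f"U+{ord(c):04X}" for c in nfkd))
```

So a single input code point can grow to 18 on output, which is what any fixed-size output buffer has to plan for.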
cryptonector | 7 years ago
Decomposing without allocating memory is actually implemented in ZFS. (And so is character-at-a-time normalization for hashing.)
I don't see how homoglyphs enter the picture. Can you explain?
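To illustrate what character-at-a-time normalization for hashing can look like, here is a rough Python sketch. It is not ZFS's code: the function name `nfkd_hash`, the choice of SHA-256, and the choice of NFKD are all assumptions made for the example. The idea is to decompose one input character at a time, hold back only the current starter plus its trailing combining marks so they can be put in canonical order, and feed each finished run to the hash instead of building the whole normalized string.

```python
import hashlib
import unicodedata


def nfkd_hash(text: str) -> str:
    """Hash the NFKD form of text without materializing the whole form.

    Hypothetical helper for illustration only.
    """
    h = hashlib.sha256()
    pending = []  # current starter (if any) plus its trailing combining marks

    def flush():
        # Stable sort by canonical combining class gives canonical ordering;
        # a starter has class 0, so it stays at the front of the run.
        pending.sort(key=unicodedata.combining)
        h.update("".join(pending).encode("utf-8"))
        pending.clear()

    for ch in text:
        # Full compatibility decomposition of a single input character.
        for cp in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(cp) == 0 and pending:
                flush()  # a new starter closes the previous run
            pending.append(cp)
    flush()
    return h.hexdigest()


# Two different orderings of the same combining marks hash identically.
print(nfkd_hash("e\u0301\u0323") == nfkd_hash("e\u0323\u0301"))  # True
```

Python still allocates behind the scenes, so this only shows the shape of the approach. In a language like C the pending run would be a small fixed buffer, bounded by the longest run of combining marks you are willing to accept.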