(no title)
emergie | 5 years ago
dz - \u0064\u007a, 2 codepoints from the Basic Latin block
DZ - \u0044\u005a
Dz - \u0044\u007a
ǳ - \u01f3, lowercase, single codepoint
Ǳ - \u01f1, uppercase
ǲ - \u01f2, TITLECASE!
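For anyone who wants to poke at this, here is a minimal Python sketch using only the standard library. The variable names are mine, purely for illustration. It checks the general category of the three single-codepoint forms and how they map to each other under case operations.

```python
import unicodedata

# The three single-codepoint forms of the dz digraph.
small = "\u01f3"    # lowercase form
capital = "\u01f1"  # uppercase form
title = "\u01f2"    # titlecase form

# Print codepoint, general category (Ll / Lu / Lt) and official name.
for ch in (small, capital, title):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

upper_of_small = small.upper()    # maps to U+01F1
title_of_small = small.title()    # maps to U+01F2, not U+01F1
lower_of_capital = capital.lower()

# Compatibility normalization splits the digraph into two Basic Latin letters.
compat = unicodedata.normalize("NFKC", small)
print(len(small), len(compat))  # 1 2
```

Note that NFKC turns the single codepoint back into plain "dz", so whether the two spellings compare equal depends on which normalization form you pick.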
What happens if you try to express dż or dź from Polish orthography? You can use
dż - \u0064\u017c - d followed by 'LATIN SMALL LETTER Z WITH DOT ABOVE'
dż - \u0064\u007a\u0307 - d followed by z, followed by combining diacritical dot above
ǳ̇ - \u01f3\u0307 - ǳ with combining diacritical dot above
and each of those is multiplied by the uppercase and titlecase forms.
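A rough sketch of how the three lowercase spellings behave under normalization, again in Python with only the standard library (the dictionary keys are just my labels). It suggests that canonical normalization (NFC) merges the first two spellings but leaves the ǳ-based one distinct, while compatibility normalization (NFKC) merges all three.

```python
import unicodedata

# Three ways to spell the same visible text, as listed above.
spellings = {
    "d + z-with-dot": "\u0064\u017c",
    "d + z + combining dot": "\u0064\u007a\u0307",
    "dz digraph + combining dot": "\u01f3\u0307",
}

# Length in codepoints differs between spellings.
lengths = {label: len(s) for label, s in spellings.items()}
print(lengths)

# Collect the distinct results of each normalization form.
nfc_forms = {unicodedata.normalize("NFC", s) for s in spellings.values()}
nfkc_forms = {unicodedata.normalize("NFKC", s) for s in spellings.values()}

print(len(nfc_forms))   # 2: the digraph codepoint has no canonical decomposition
print(len(nfkc_forms))  # 1: compatibility decomposition splits the digraph
```

So a naive `==` comparison sees three different strings, NFC sees two, and NFKC sees one.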
In Polish orthography the dz digraph is considered two letters, despite being only one sound (głoska). I'm not so sure about Macedonian orthography; they might count it as one thing.
ß is a medieval letter/ligature that was created from ſʒ, that is, a long s and a tailed z. In other words, it is a form of the 'sz' digraph. Today it is used only in German orthography.
How long is ß?
Under some rules, uppercasing ß yields SS or SZ. Should uppercasing or titlecasing operations change the length of a string?
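As one data point, here is what Python 3's built-in string methods do with ß. This is only a sketch of one implementation's behavior, not a statement about what the right answer is; the variable names are mine.

```python
sharp_s = "\u00df"  # ß

upper = sharp_s.upper()       # full case mapping expands to two letters
title = sharp_s.title()       # titlecase expands too, with a lowercase second letter
folded = sharp_s.casefold()   # case folding for caseless comparison

print(len(sharp_s), upper, len(upper))  # 1 SS 2
print(title, folded)                    # Ss ss

# The capital sharp s (U+1E9E) lowercases to ß, but ß does not uppercase to it.
capital_sharp_s = "\u1e9e"
lowered_capital = capital_sharp_s.lower()

# Round-tripping through upper() and lower() does not restore the original.
word = "stra\u00dfe"
round_trip = word.upper().lower()
print(word.upper(), round_trip)
```

So in this implementation the answer to the length question is yes: uppercasing can make a string longer, and the operation is not reversible.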
saurik|5 years ago