top | item 44362822

(no title)

Yeah, unfortunately I feel like despite all the advances in Unicode tech, my modern terminal (MacOS) still bugs out badly with emojis and certain special characters.

I'm not sure how/when codepoints matter for wcwidth: my terminal handles many characters with more than one codepoint in UTF-8, like é and even Arabic characters, just fine.

discuss

o11c|8 months ago

`wcwidth` works by assigning all codepoints (strictly, code units of whatever size `wchar_t` is on your system, but thankfully modern Unixen are sane) a width of -1 (error), 0 (combining), 1 (narrow), or 2 (wide).

`wcswidth` could in theory work across multiple codepoints, but its API is braindead and cannot deal with partial errors.

This is all from the perspective of what the application expects to output. What the terminal itself does might be something completely different - decomposed Hangul in particular tends to lead to rendering glitches in curses-based terminal programs.

This is also different from what the (monospace) font expects to be rendered as. At least it has the excuse of not being able to call the system's `wcwidth`.

Note that it is always a mistake to call an implementation of `wcwidth` other than the one provided by the OS, since that introduces additional mismatches, unless you are using a better API that calculates bounds rather than an exact width. I posted an oversimplified sketch (e.g. it doesn't include versioning) of that algorithm a while back ...

https://news.ycombinator.com/item?id=43851532

PhilipRoman|8 months ago

As fallback, you can also just emit the character and see how far the cursor advanced via CSI 6n (try printf '\x1b[6n')