top | item 41775297

BoringTimesGang | 1 year ago

>It is issues like this due to which I gave up on C++. There are so many ways to do something and every way is freaking wrong!

These are mostly Unicode or linguistics problems.

tralarpa|1 year ago

The fact that the standard library works against you doesn't help (tolower takes an int, but only works correctly on values representable as unsigned char, and even then only sometimes, while a wchar_t is implicitly converted to int).
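
To make the int/unsigned char trap concrete, here is a minimal sketch of the usual workaround for std::tolower from <cctype>. The helper names lower_char and lower_bytes are made up for illustration, and the sketch assumes the default "C" locale, where only A-Z are changed.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: lowercase a single char via std::tolower.
// The cast to unsigned char is the important part: where plain char is
// signed, bytes >= 0x80 are negative, and passing a negative value other
// than EOF to std::tolower is undefined behavior.
inline char lower_char(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical helper: apply lower_char to every byte of a string.
// This is a byte-wise transform; it knows nothing about the encoding.
inline std::string lower_bytes(std::string s) {
    for (char& c : s) {
        c = lower_char(c);
    }
    return s;
}
```

Calling lower_bytes("Hello, World") yields "hello, world". The cast avoids the undefined behavior but does not make the function encoding-aware.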

BoringTimesGang|1 year ago

tolower is in the std namespace but is actually just part of the C89 standard library, meaning it predates both UTF-8 and UTF-16. Is the alternative that it should be made unusable, and more existing code broken? A modern user has to include one of the c-prefixed headers (<cctype> in this case) to use it, already hinting to them that 'here be dragons'.

But there are always dragons. It's strings. The mere assumption that they can be transformed int-by-int, irrespective of encoding, is wrong. As is the assumption that a sensible transformation to lower case without error handling exists.
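
As a rough illustration of the error-handling point, here is a sketch of a lowercasing function that refuses input it cannot handle instead of silently transforming it unit by unit. The name try_ascii_lower is hypothetical, and the sketch assumes an ASCII-compatible execution character set (so 'A' through 'Z' are contiguous); it is not a substitute for real Unicode case mapping.

```cpp
#include <string>

// Hypothetical helper: lowercase `in` only if every byte is 7-bit ASCII.
// Returns false and leaves `out` untouched otherwise, so the caller is
// forced to decide what to do with text in some other encoding.
inline bool try_ascii_lower(const std::string& in, std::string& out) {
    std::string result;
    result.reserve(in.size());
    for (unsigned char byte : in) {
        if (byte >= 0x80) {
            return false;  // not ASCII: e.g. part of a UTF-8 multi-byte sequence
        }
        if (byte >= 'A' && byte <= 'Z') {
            // Assumes ASCII layout, where upper and lower case are a fixed offset apart.
            byte = static_cast<unsigned char>(byte - 'A' + 'a');
        }
        result.push_back(static_cast<char>(byte));
    }
    out = result;
    return true;
}
```

With this sketch, try_ascii_lower("Hello", out) succeeds and sets out to "hello", while the UTF-8 bytes of "ÉCOLE" (written as "\xC3\x89" "COLE") are rejected rather than half-converted.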