derobert | 2 years ago

Some unarchivers (especially ones that aren't Unix natives, like, say, rar) seem to love to do it. They're just buggy, of course.

Beyond that, before the mid-2000s, it was common to use non-UTF-8 locales on Linux. So I'm sure I still have ISO-8859-1/-15 encoded file names somewhere, especially in archived data. They're not always trivial to rename either, because there might be references to them by name. (Or, in odd cases, you can't convert the name to UTF-8 because you hit a filename length limit, since non-ASCII characters take more bytes in UTF-8 than in ISO-8859-1.)
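To make the length problem concrete, here's a rough Python sketch. The 255-byte limit is an assumption (it's the usual per-name limit on many Linux filesystems, but check your own), and `utf8_name_fits` is just a name I made up for illustration:

```python
# Assumed per-name limit in bytes; typical on many Linux filesystems,
# but not guaranteed everywhere.
NAME_MAX = 255

def utf8_name_fits(latin1_name: bytes, limit: int = NAME_MAX) -> bool:
    """Return True if an ISO-8859-1 name still fits after conversion to UTF-8."""
    # Every byte is valid ISO-8859-1, so this decode cannot fail.
    utf8_name = latin1_name.decode("iso-8859-1").encode("utf-8")
    return len(utf8_name) <= limit

# 200 accented characters plus ".txt": 204 bytes in ISO-8859-1,
# but each "é" becomes two bytes in UTF-8.
name = ("é" * 200 + ".txt").encode("iso-8859-1")
print(len(name), utf8_name_fits(name))
```

A name that was comfortably under the limit in the old encoding can end up over it once every accented character doubles in size.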

I believe wanting to access data from 20 years ago is a perfectly reasonable use case.

It's not so bad if a program can't display the file name right, as long as it doesn't crash with an exception or refuse to open the file. Unix file names have been defined as arbitrary sequences of octets except / and NUL for 30+ years.
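For what it's worth, here's roughly how a program can cope, sketched in Python. The helper names are made up for illustration; the only things relied on are `os.listdir` returning `bytes` names when given a `bytes` path, and the standard `replace` and `surrogateescape` error handlers:

```python
import os

def list_raw(path: bytes = b".") -> list:
    """List directory entries as raw bytes, with no decoding step that could fail."""
    return os.listdir(path)

def display_name(raw: bytes) -> str:
    """Best-effort text for showing to a user; invalid bytes become U+FFFD."""
    return raw.decode("utf-8", errors="replace")

def to_text_lossless(raw: bytes) -> str:
    """Decode so the original bytes can be recovered exactly later."""
    return raw.decode("utf-8", errors="surrogateescape")

def to_raw(name: str) -> bytes:
    """Inverse of to_text_lossless."""
    return name.encode("utf-8", errors="surrogateescape")

raw = b"caf\xe9.txt"  # "café.txt" in ISO-8859-1, which is not valid UTF-8
print(display_name(raw))                       # shown imperfectly, but no exception
print(to_raw(to_text_lossless(raw)) == raw)    # the real name survives the round trip
```

The point is to keep the display string and the name used for opening the file separate: the first can be lossy, the second must carry the original bytes.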
