(no title)
kdheepak | 1 year ago
For example, my most recent Julia project has the following line:
windows1252_to_utf8(s) = decode(Vector{UInt8}(String(coalesce(s, ""))), "Windows-1252")
Figuring out that I had to use Windows-1252 (and not Latin1) took a lot more time than I would have liked.
I get that there are some ergonomic challenges around this in languages like Julia that are optimized for data analysis workflows, but imho all data analysis languages/scripts should either be forced to explicitly list encodings/decodings whenever reading/writing a file, or default to UTF-8.
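To illustrate why the two encodings are easy to confuse, here is a minimal sketch in plain Julia (no packages). The helper names `latin1_to_utf8` and `cp1252_to_utf8` and the table `CP1252_EXTRA` are hypothetical, and the table covers only a handful of the bytes that Windows-1252 reassigns, so treat it as a demonstration rather than a usable decoder.

```julia
# Bytes as they might appear in a Windows-1252 file: the word "hi" in curly quotes.
bytes = UInt8[0x93, 0x68, 0x69, 0x94]

# Latin-1 maps every byte to the code point with the same value,
# so 0x93 and 0x94 come out as invisible C1 control characters.
latin1_to_utf8(b) = join(Char.(b))

# Windows-1252 reassigns most of the 0x80-0x9F range to printable characters.
# Hypothetical partial table, for illustration only.
CP1252_EXTRA = Dict{UInt8,Char}(
    0x80 => '€', 0x85 => '…',
    0x91 => '‘', 0x92 => '’',
    0x93 => '“', 0x94 => '”',
)
cp1252_to_utf8(b) = join(get(CP1252_EXTRA, x, Char(x)) for x in b)

@show latin1_to_utf8(bytes)   # control characters around "hi"
@show cp1252_to_utf8(bytes)   # curly quotes around "hi"
```

Outside the 0x80-0x9F range the two encodings agree, which is presumably why decoding as Latin1 looks almost right until a curly quote or euro sign shows up.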
samatman | 1 year ago
Next time you try to load whoops-weird-encoding.txt as utf-8, and get garbage, may I suggest `file whoops-weird-encoding.txt`? It's pretty good at guessing.
There might be a Julia package which can do that as well. I haven't run into the problem so I have no need to check.
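Short of a package, a rough guess can be sketched with Base alone. The function `guess_encoding` below is hypothetical and much cruder than `file`: it only separates ASCII, valid UTF-8, and two 8-bit cases, on the assumption that real text rarely contains C1 control bytes.

```julia
# Rough guess at a text encoding; returns a label, not a guarantee.
function guess_encoding(bytes::Vector{UInt8})
    # Pure 7-bit data decodes the same way in all of the encodings considered here.
    all(<(0x80), bytes) && return "ASCII"
    # Base can check whether the bytes form well-formed UTF-8.
    isvalid(String, bytes) && return "UTF-8"
    # Bytes in 0x80-0x9F are control characters in Latin-1 but printable
    # in Windows-1252, so their presence hints at the latter (assumption).
    any(b -> 0x80 <= b <= 0x9f, bytes) && return "Windows-1252 (likely)"
    return "Latin-1 or another 8-bit encoding"
end
```

Something like `guess_encoding(read("whoops-weird-encoding.txt"))` would then give a first hint before you pick an explicit decoder.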