For context, in case anyone needs it, that's a common date format in Japan. Aside from the kanji characters, the big surprise to most of the rest of the world is that the largest unit is an era name[1], tied to the reign of the Japanese emperor.
Microsoft Excel is the worst offender here. On a locale that uses "," as the decimal separator, it can't read CSVs that use "." as the decimal separator, and it expects ";" instead of "," as the field separator.
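To make the CSV mismatch above concrete, here is a minimal Python sketch that reads a semicolon-delimited file with decimal commas. The function name `read_decimal_comma_csv` and the sample data are made up for illustration, and it assumes every cell is numeric and contains no thousands separators.

```python
import csv
import io
from decimal import Decimal


def read_decimal_comma_csv(text, delimiter=";", decimal_mark=","):
    """Parse CSV text whose cells are numbers written with a decimal comma.

    Hypothetical helper: assumes all cells are numeric and that no
    thousands separators are present.
    """
    rows = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Swap the locale's decimal mark for "." before converting.
        rows.append([Decimal(cell.replace(decimal_mark, ".")) for cell in row])
    return rows


sample = "1,5;2,25\n3,004;10\n"
rows = read_decimal_comma_csv(sample)
```

The point is that the delimiter and decimal mark are passed in explicitly instead of being guessed from the machine's locale.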
This is a relic from the olden times when application data was rarely exchanged across locales, and people expected software to conform to the local conventions (and they largely still expect it). Microsoft never changed this because it would have broken (and would still break) a vast number of systems and workflows.
Take a moment to note that it's *crucial* not to localize strings in health software services, because it can lead to data leaks and performance degradation. It's better to work with global APIs so you're protected from all sorts of risks.
Renaming the problem doesn’t make it go away. It might be useful for identifying the subset of parsing which is problematic, but I think the article already achieves this well by specifying the subset of input under discussion.
OT: Why does almost every comment in this thread currently say “2 hours ago”, when they were probably written when this story was first featured, about 3 days ago?
Hovering over the time-ago item on the comment header displays the exact post time, and interestingly it shows times from Feb 16 (3 days ago) for many of the "2 hours ago" comments. Must be an artifact of some moderation tool.
You are trying to apply what you know versus what others know. No different than Fahrenheit vs Celsius or Yard vs Meter.
It depends on the local norms. Where I live, 1,004 is decimal, 1 004 or 1'004 is 1004 which makes it even more clear than the en-US default. That is, the 1.004 variant is never used, and if it is, it is assumed to be a decimal (misspelling) of 1,004.
What I used to do is set the thousands separator to ' in the operating system settings. That made Excel read CSV files with 1,004 and 1.004 the same, as one and four thousandths. No one puts thousands separators in CSV files anyway, so that worked out. And it looked nice too.
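The same behaviour can be sketched outside Excel. This is a rough Python illustration, not how Excel does it: `parse_lenient` and the `GROUPING` set are hypothetical names, and the rule (treat both "," and "." as decimal marks, treat apostrophes and spaces as grouping) is an assumption taken from the description above.

```python
from decimal import Decimal

# Characters treated purely as digit grouping (assumption for this sketch):
# apostrophe, space, no-break space, narrow no-break space.
GROUPING = {"'", " ", "\u00a0", "\u202f"}


def parse_lenient(text):
    """Read "1,004" and "1.004" alike as one and four thousandths."""
    cleaned = "".join(ch for ch in text.strip() if ch not in GROUPING)
    cleaned = cleaned.replace(",", ".")
    if cleaned.count(".") > 1:
        # Two decimal marks cannot be resolved without knowing the locale.
        raise ValueError(f"ambiguous number: {text!r}")
    return Decimal(cleaned)
```

Input such as "1.004,5" is rejected instead of guessed at, since that is exactly the case where the locale matters.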
OptionOfT|1 year ago
Also, I hate DOB selectors which don't allow me to manually enter the date, and default to today, and don't have a year << arrow. Only month.
Now I need to click the < arrow at least (age - 1) * 12 times.
In general, I wish more websites would use native date / number / dropdown pickers.
Workday is the worst offender here.
toast0|1 year ago
1718627440|1 year ago
frizlab|1 year ago
I think that’s true on all OSes
teddyh|1 year ago
(With apologies to patio11.)
RainyDayTmrw|1 year ago
This parallels, and the remark about patio11 refers to, this article[2], which has since become famous on HN. It ends with a similar remark from the author's prior experience as an American expatriate in a less populous and less cosmopolitan part of Japan, when a clerk remarked that Patrick McKenzie was a troublesome name to have in Japan, and asked why he didn't change it to something convenient and ordinary like Tanaka Taro[3].
This has since become HN folklore.
[1] https://en.wikipedia.org/wiki/Japanese_era_name
[2] https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...
[3] https://news.ycombinator.com/item?id=6145768
legulere|1 year ago
Delegating parsing user input is a good idea, but sometimes the input methods you can rely on just don't cut it.
By the way: The international way to express a decimal separator is a (thin non-breaking) space. There's no misunderstanding possible.
LegionMammal978|1 year ago
According to whom? The CGPM recommends thin spaces as thousands separators, and either points or commas as decimal separators. NIST, ISO, etc. generally copy this, sometimes stipulating the decimal separator as one or the other.
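As a rough illustration of that convention, here is a small Python sketch that groups the integer digits with a thin space and lets the caller pick the decimal marker. The function name `format_si` is made up, and using U+202F (narrow no-break space) as the thin space is my assumption. It does not group the digits after the decimal marker.

```python
NNBSP = "\u202f"  # narrow no-break space, assumed here as the "thin space"


def format_si(value, decimals=2, decimal_mark=","):
    """Format a number with thin-space digit grouping (hypothetical helper)."""
    text = f"{value:,.{decimals}f}"  # e.g. "1,234,567.89"
    integer, _, fraction = text.partition(".")
    grouped = integer.replace(",", NNBSP)
    return grouped + (decimal_mark + fraction if fraction else "")
```

Calling `format_si(1234567.891)` groups the integer part in threes and uses a comma as the decimal marker; passing `decimal_mark="."` gives the point variant.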
layer8|1 year ago
stefs|1 year ago
i really hope you mean thousands separator ...
croes|1 year ago
cytocync|1 year ago
RadiozRadioz|1 year ago
Retr0id|1 year ago
It can and should be, though. I feel like we should have a separate word for parsing when the rules are not well-defined - something like "fuzzy parsing" (in a similar vein to fuzzy string comparison)
eyelidlessness|1 year ago
pwdisswordfishz|1 year ago
xigoi|1 year ago
teddyh|1 year ago
tmiku|1 year ago
unknown|1 year ago
[deleted]
Waterluvian|1 year ago
yndoendo|1 year ago
Personally, the MM/DD/YYYY format, which is standard in the USA, needs to die and be replaced with YYYY-MM-DD.
Same with 12 hour time and replacing it with 24 hour. As the saying goes, Americans use am and pm because they can't count past 12. AM and PM are a waste of code and display area. What fits in 2 characters takes up 5 characters.
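For anyone who wants to do that conversion in code, a minimal Python sketch using only the standard library follows. The helper names `us_to_iso` and `to_24_hour` are made up, and it assumes the input really is MM/DD/YYYY and "h:mm AM/PM" in an English locale.

```python
from datetime import datetime


def us_to_iso(text):
    """Convert "MM/DD/YYYY" to "YYYY-MM-DD" (assumes US field order)."""
    return datetime.strptime(text, "%m/%d/%Y").date().isoformat()


def to_24_hour(text):
    """Convert a 12-hour time like "2:05 PM" to "14:05"."""
    return datetime.strptime(text, "%I:%M %p").strftime("%H:%M")
```

Note that nothing in a string like 03/04/2025 says which field is the month, so the format has to be known up front.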
trinix912|1 year ago
OptionOfT|1 year ago
But it can be worse. Los Angeles, Sunday, November 2, 2025, 1:30:00 am is ambiguous. Is it PST or PDT?
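A short Python sketch of that ambiguity, using the standard `zoneinfo` module and the `fold` attribute from PEP 495. It uses 1:30 am, since on that date it is the 1:00 to 1:59 hour that occurs twice when the clocks go back. It assumes the system has time zone data available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LA = ZoneInfo("America/Los_Angeles")

# The same wall-clock time happens twice on the night DST ends.
first = datetime(2025, 11, 2, 1, 30, tzinfo=LA)  # fold=0: first occurrence
second = first.replace(fold=1)                   # fold=1: second occurrence

# Convert to UTC to see that they are two different instants.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)

print(first.tzname(), first_utc.isoformat())
print(second.tzname(), second_utc.isoformat())
```

Without the offset or zone abbreviation stored alongside it, a bare local timestamp in that hour cannot be mapped back to a single instant.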
munch117|1 year ago
In today's Windows 11 I can't find that setting. You can't set the thousands separator separately, not anywhere that I can find. It's a tragedy. I see Excel misreading CSV files all the time. I don't use Excel that much myself and I understand what's going on, so it doesn't affect me all that much directly, but for my Excel warrior colleagues, it's another matter.
alkonaut|1 year ago
toast0|1 year ago
Which is to say, you assume it's as written, unless context suggests otherwise.
ValleZ|1 year ago