I think the lesson to take from these lists is that covering every edge case for many real-world situations becomes asymptotically difficult.
You can get 95% with no effort at all, 99% with a bit of thought, 99.9% with some effort, 99.99% with a smart design, some research, and a lot of effort, and anything further requires making it your full-time job indefinitely.
It gets thornier when it comes to personal things like names and gender. If you tell a group of people that their identity isn't supported by your system, some people will accept that and work around it, some people will be mad at you, and you'll just have to accept that the latter have a good point. You have to stop developing it at some point, so all you can do is try to cover as many people as possible.
That said, it is incredibly revealing of your personal biases if you design a system that breaks on things like names with "é", or "ñ", or punctuation; or if you implement gender as a boolean that also determines the user's pronouns.
In regards to your last line, I feel like we do need to give people some leeway and not assume bigoted intent. Most of the things we can gig people for not implementing nowadays are only recently important in our cultural consciousness. English has traditionally shied away from using accents and one can reasonably assume that if a person signs up for my English language website, they won't be using Chinese characters in their name. As for the boolean determining pronouns thing, I can easily see someone thinking that was a clever implementation before people adopted the idea that there are more than two to be cognizant of. I also think using they/them is only recently considered grammatically acceptable. Going forward, we do want to support the things you mentioned.
> That said, it is incredibly revealing of your personal biases...if you implement gender as a boolean that also determines the user's pronouns.
It is probably revealing of your personal biases that you cannot imagine systems in languages other than English that need to do some of this since multiple parts of the language, including verbs, are necessarily gendered, with a neutral gender usually only reserved for inanimate things. Reference: I happen to speak one such language.
Problems like this are just hard due to generations of built-up cultural context and the associated effects. We can and should do better, but there is no need to get overly enthusiastic about accusing people of biases while we are at it.
> It gets thornier when it comes to personal things like names and gender. If you tell a group of people that their identity isn't supported by your system, some people will accept that and work around it, some people will be mad at you, and you'll just have to accept that the latter have a good point. You have to stop developing it at some point, so all you can do is try to cover as many people as possible.
That’s a pretty big if, which itself reveals a bias. While it’s certainly not going to cover every single identity, the following will likely cover vastly more than added effort:
1. Do you need to ask for the data at all?
2. If 1, do you need to require the data?
3. If 1, unless you have to comply with arbitrary external requirements (e.g. law), use a free-form text field.
4. Test that the text field and accompanying information are accessible to assistive tools. This usually comes for free if you use built-in tools.
5. If not 2, affirmatively indicate that entering anything is optional.
6. Ask for feedback on how you can do it better, again making this obviously optional. If the feedback is meaningful (i.e. “I cannot use this”, rather than “I would prefer this be less inclusive”), accommodate it.
> ...it is incredibly revealing of your personal biases.
People make mistakes. I think you're opening a can of worms if you try to conflate a lack of robustness in user experience accessibility with some underlying anti-inclusivity bias.
> It gets thornier when it comes to personal things like names and gender.
In the case of names, I read an article some time back about the problem of getting people's full names right. Not everyone has a "first name" and "last name" distinction, so it is better to have a single "full name" field and let people enter it according to their locale and tradition instead of having separate "first name" and "last name" fields. I think gender falls into the same category.
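A minimal sketch of the single-field idea, in Python. The function name `clean_full_name` is hypothetical, and the assumptions are that storage is Unicode-aware and that the name is kept as typed apart from Unicode normalization and whitespace cleanup:

```python
import unicodedata


def clean_full_name(raw: str) -> str:
    """Normalize a free-form full name without rejecting any characters."""
    # NFC composes sequences such as "e" + combining acute into a single "é",
    # so the same name typed on different keyboards compares equal.
    normalized = unicodedata.normalize("NFC", raw)
    # Collapse runs of whitespace; hyphens, apostrophes, accents, and
    # non-Latin scripts are left exactly as entered.
    collapsed = " ".join(normalized.split())
    if not collapsed:
        raise ValueError("name must not be empty")
    return collapsed
```

Note what it does not do: no splitting into parts, no character whitelist, no forced capitalization. Those are the steps that tend to break hyphenated or accented names.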
> ...it is incredibly revealing of your personal biases...
I think that is an unfair accusation.
I've been coding for over 20 years, and enabling UTF-8 is the only thing I have ever had to do to cover every use case (web-based data/content) involving special characters. I work with companies in multiple foreign countries, and my biases are irrelevant to technically working systems.
Are you sure most systems even need to record gender or pronouns? Outside of medicine, sex is rarely significant, and if by gender you mean masculine or feminine traits, then again (other than to the person themselves) it’s largely irrelevant, I’d have thought.
Looking back at this, yeah, I agree with the criticism that the last line is overly harsh. I try to be lenient to people with unconscious bias, and this doesn't help with that.
This is in no way a list of falsehoods; there is not a single one in the repository. It is a dump of links to other people's articles about falsehoods (or other people's dumps of links to others). I am surprised they would claim that. Maybe they copied another "awesome list" template?
Better title would be, "list of common simplifying assumptions programmers and designers make."
Avoiding all these assumptions will introduce unacceptable complexity for many projects. In many of the listed cases, making the assumption will require a tiny fraction of your addressable market to use a workaround, which they are likely already used to using with other products.
So there is a cost-benefit study to be done here. You have finite engineering time. Which of these assumptions can you avoid with little net increase in complexity? Which ones are actually simplifying? Which ones will cause critical problems for your users? And which ones are too expensive to fix given the return on investment?
I think the content of lists like this is decently useful, but the tone is condescending and unwarranted. I don't "believe" that everyone has a first, middle, and last name. I do believe that people with fifteen names are capable of leaving out some of the ones in the middle, and it makes my job a lot easier to require them to do so.
People who have the wrong kinds of names pay a heavy price for this kind of thing. You gave an absurd example (15 names), but much more common cases tend to break. For example, my wife's legal name has a hyphen in it. It is astounding how many web sites cannot cope with this. Or worse, accept the correct name in one place and not in another, or accept the entry and then report a name mismatch later when they "sanitize the input".
A normal email address has exactly one @ character. There is an infinite number of possible weird things that everyone has a small but non-zero chance of encountering in every corner of their life.
> therefore your implementation should allow this
I wouldn't be so sure it should. Supporting this has direct and indirect, explicit and implicit costs that one has to weigh before deciding whether it's worth it.
In fact, whatever a human can believe about the world will never equal what actually is. So we just make decisions about how precise our models have to be to be practical.
You can never finish a program if you insist it has to be perfect and unbreakable. The only reasonable objective is to make sure it does the job 99.9 % of real-life times.
All that said, there are a number of really good email validators out there in the wild.
One of the oldest can be found at http://isemail.info/about (along with a test suite for writing your own implementation and some sage advice about which basic approaches are reasonable when validating email addresses).
Yes, dependencies aren't completely free conceptually, even when they're distributed free of charge, but why write a crappy validator of your own when there are really good ones available?
> A normal email address exactly has one @ character
I've been using <[email protected]@myemaildomain> to respond to people for ages, and this is a completely normal and supported email flow. Who are you to tell other people what is normal? It is not some possible weird thing that other people can use emails that follow the spec and aren't even all that out there.
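For what it's worth, the cheapest sanity check that still tolerates more than one @ is to split on the last one. A minimal Python sketch, assuming the domain is an ordinary hostname (which cannot contain @), and using a made-up address and a hypothetical helper name `split_address`. It is not a full validator, and a maintained library is still the better choice for that:

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local part, domain) on the LAST @ sign.

    Assumption: the domain is a plain hostname, so any earlier @ signs
    belong to the local part (for example inside a quoted string).
    """
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"not an address: {address!r}")
    return local, domain


# Made-up example: a quoted local part that itself contains an @.
local, domain = split_address('"john@work"@example.com')
print(local)   # "john@work"
print(domain)  # example.com
```

A check that counts @ signs and insists on exactly one would have rejected that address.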
"The only reasonable objective is to make sure it does the job 99.9 % of real-life times."
If you were writing software for airplanes or cars, that would mean lots of crashes per day, since 0.1% of millions is still a lot. If it is financial software, people might lose millions regularly.
But if you make a casual game ... who cares (except the person who lost his high-score achievements).
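To put rough numbers on that, a quick Python sketch. The daily volumes are made-up round figures for illustration, not real statistics, and `expected_failures` is a hypothetical helper:

```python
def expected_failures(events_per_day: int, success_rate: float) -> float:
    """Expected number of failing events per day at a given success rate."""
    return events_per_day * (1.0 - success_rate)


# Illustrative volumes only (assumptions, not measured data).
for label, volume in [
    ("casual game sessions", 10_000),
    ("payment transactions", 5_000_000),
]:
    failures = expected_failures(volume, 0.999)
    print(f"{label}: ~{failures:,.0f} failures/day")
```

The same 99.9% that means a handful of annoyed players means thousands of failed payments, which is why the acceptable number depends on the domain.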
The main falsehood is that the programmer must adopt, as requirements, all edge cases of every imaginable instance and interpretation of a problem domain.
Even if the programmer wanted to do that, his boss, and boss's boss would disagree.
It is a falsehood, of course, that "I don't have to handle cases outside of the specification we committed to". Though not always, and not for all cases.
Say that my requirements are to write a music engraving program for western notation. Of course, I take it for granted that (that type of) music can be written down; my program is not an expression of my personal belief that all music can be written down, or that it has a composer.
"Falsehoods programmers believe about geography: Street addresses contain street names.
In many remote places in Europe, the hamlet name is considered a sufficient address."
Yup, I used to spend my summers on an island where my address would be: farmstead name, island, country. Zip code was optional. Had to find creative ways to satisfy various web forms.
Another that list misses is that houses must be reachable by street at all. My girlfriend 20 years ago had an address on a canal (though the back of her house was reachable by unnamed alleyways).
The thing about postal databases bit me many times as the first owner of a new home, too. So many times some software system didn't believe my address was valid because my builder filed the original deed incorrectly. Actually, this was worse because I couldn't even pay property tax for three years until it was fixed. But believe me, software service X that told me my address didn't exist: my house is real, and every delivery service I have ever tried to use managed to find it.
Lived in Puerto Rico for a while a few years back, and stuff off the beaten path (and plenty of things in town as well) wouldn't have an "address number", but instead would have a "position" based off kilometer markers.
For a good year, my address was something like "KM 5.4 Carr. 635", which was effectively "Kilometer 5.4 Highway 635". And then they put out new kilometer markers and my address changed by 0.2 km one way according to the power company and the other way according to the water company.
And IIRC, the power company had "yellow house in the back" as part of my address, since there were three houses on this lot. Don't know what would've happened if I painted the house!
Edit: Ha! Looks like a few things have changed in Google Maps and a number of things now have addresses showing up with their km marker numbers, which didn't use to be the case. For example, the Caribbean Cinemas location in Arecibo, Puerto Rico [0] has the address "Carr. #2, Km. 81.0 Barrio Hato Abajo, Arecibo, PR 00612"
One that keeps tripping me up while I'm living in the Caribbean is "all addresses have ZIP codes". They've started adding postal codes here, but not many places have them. It's easy enough to enter 00000, but before moving here I wouldn't have known there are places without some form of postal code.
When you onboard to a new company, they will automatically give you access to whatever systems you need to do your job.
Related: When you're hired to be a computer programmer, the company will have a computer for you on your first day of work.
This happened to me twice. The first time at a low-rent startup where people were expected to bring their own machines. (Company folded months later amid an unrelated government investigation.) The second time was when the previous programmer did something unpleasant to the computer on his way out, and the company's procurement process for getting replacement gear required lots of approvals because dev machines are higher-spec and more expensive than an ordinary office computer.
I would say the main takeaway from these lists is: Use a library where the library author has a much bigger vested interest in the edge cases than you do.
Don't implement date and time calculations yourself, just use the library, even for things like "add a day".
Don't parse addresses yourself, use a library. Don't validate email yourself, use a library. Don't validate input yourself, use a library.
Unless the runtime or memory space is absolutely critical to your application, using a library won't make things noticeably worse.
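As an illustration of why even "add a day" is worth delegating, here is a small Python sketch using only the standard library (`datetime` and `zoneinfo`, Python 3.9+, with time zone data available on the system). The helper names and the chosen date are assumptions picked to straddle the 2021 US daylight-saving change:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def add_calendar_day(moment: datetime) -> datetime:
    """Same wall-clock time on the next calendar day."""
    return moment + timedelta(days=1)


def elapsed_hours(start: datetime, end: datetime) -> float:
    """Real elapsed time, compared in UTC so offset changes are counted."""
    delta = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return delta.total_seconds() / 3600


# Noon on the day before clocks spring forward in New York.
before_dst = datetime(2021, 3, 13, 12, 0, tzinfo=NEW_YORK)
after_dst = add_calendar_day(before_dst)

print(after_dst.isoformat())                 # still 12:00 local time
print(elapsed_hours(before_dst, after_dst))  # 23.0, not 24.0
```

"One day later" and "24 hours later" are different questions here, and hand-rolled code that adds 86,400 seconds answers the wrong one for calendar-style features.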
Boss: Why is it taking so long to implement this feature?
Programmer: Well, you see Boss, I found this really cool list of falsehoods that only dumb, less-holy programmers believe on a random Github. It's really cool because it lists every edg...Boss? Where are you going?
Most of these lists (at least the ones I scanned through) are simply assertions, with no explanations, examples, or links. This is less than useful, and makes me doubt their veracity.
I checked a few random lists and their "falsehoods" weren't very informative when it's just a list without any explanation. Maybe they're just supposed to be amusing or contrarian.
- Programming skill is all you need to build great software.
About the only case where I would say this might be true is when your users are other programmers. If the plan is to target a non-technical market, good design and in-depth understanding of the problem space are just as important if not more so.
My favorite example of this is the world's most popular accounting software. Good design and marketing --- mediocre programming.
Not on the 'email' list, but I'll add "A .org domain extension should be the sole criterion for whether a customer can be classified as a nonprofit".
I tried to create a nonprofit account at Box.com a few years ago with a .net address. It took almost 3 weeks of back-and-forth with customer support and multiple interactions with higher-ups in their legal dept. to resolve. No one there had ever considered the possibility.
This kind of thing scares me about the current state of AI. Language models are trained to produce text that looks like the training data, but if the training data has falsehoods they will happily repeat those falsehoods.
I just put the example into GPT3:
Prompt: Can an email address contain more than one @ sign?
Answer: No, an email address can only contain one @ sign.
I'm on my phone so it's a bit laborious to test more
> "Time passes at the same speed on top of a mountain and at the bottom of a valley".
Just curious, does this kind of gravitational time dilation bear any consequence in our systems, given that we are building systems with the assumption that system clocks are not always accurate and there is no global time?
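For a sense of scale, a back-of-the-envelope estimate using the standard weak-field approximation, assuming a height difference of 1000 m between mountain top and valley floor (an illustrative figure, not one from the list):

```latex
\frac{\Delta t}{t} \approx \frac{g\,\Delta h}{c^{2}}
  = \frac{9.81\ \mathrm{m/s^{2}} \times 1000\ \mathrm{m}}{\left(3.00 \times 10^{8}\ \mathrm{m/s}\right)^{2}}
  \approx 1.09 \times 10^{-13}
\qquad\Longrightarrow\qquad
1.09 \times 10^{-13} \times 86\,400\ \mathrm{s/day} \approx 9.4\ \mathrm{ns/day}
```

Commodity quartz clocks drift by something on the order of parts per million, many orders of magnitude more, so for ordinary servers this effect should disappear into the error that clock synchronization already has to absorb. The well-known case where relativistic corrections do matter is satellite navigation such as GPS, where the onboard clocks are deliberately adjusted for them.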
a_shovel | 4 years ago
BitwiseFool | 4 years ago
quadrifoliate | 4 years ago
eyelidlessness | 4 years ago
planetsprite | 4 years ago
_448 | 4 years ago
RobertRoberts | 4 years ago
implements | 4 years ago
a_shovel | 4 years ago
matsemann | 4 years ago
No, these "awesome" lists are seldom curated. They are just dumps of hundreds or thousands of links.
layer8 | 4 years ago
remram | 4 years ago
asdfasgasdgasdg | 4 years ago
not2b | 4 years ago
qwerty456127 | 4 years ago
NateEag | 4 years ago
krageon | 4 years ago
Shorel | 4 years ago
Can you provide us with an example?
hutzlibu | 4 years ago
If you were writing software for airplanes or cars, that would mean lots of crashes per day, since 0.1% of millions is still a lot. If it is financial software, people might lose millions regularly.
But if you make a casual game ... who cares (except the person who lost his high-score achievements).
Obviously there are different standards.
kazinator | 4 years ago
ihateolives | 4 years ago
nonameiguess | 4 years ago
ajford | 4 years ago
For a good year, my address was something like "KM 5.4 Carr. 635", which was effectively "Kilometer 5.4 Highway 635". And then they put out new kilometer markers and my address changed by 0.2 km one way according to the power company and the other way according to the water company.
And IIRC, the power company had "yellow house in the back" as part of my address, since there were three houses on this lot. Don't know what would've happened if I painted the house!
Edit: Ha! Looks like a few things have changed in Google Maps and a number of things now have addresses showing up with their km marker numbers, which didn't use to be the case. For example, the Caribbean Cinemas location in Arecibo, Puerto Rico [0] has the address "Carr. #2, Km. 81.0 Barrio Hato Abajo, Arecibo, PR 00612"
[0] https://caribbeancinemas.com/theater/arecibo/
UglyToad | 4 years ago
boredumb | 4 years ago
ballenf | 4 years ago
When you onboard to a new company, they will automatically give you access to whatever systems you need to do your job.
or
It will be automatic or easy to get access to all the systems you need to do your job.
reaperducer | 4 years ago
michaelcampbell | 4 years ago
When you onboard to a new company, there will be a list of the systems you'll need to do your job.
Once you know what tools you need, you'll have training, and documentation on them. This is especially true for internally written tools.
throwaway889900 | 4 years ago
You will be onboarded to your new company.
jedberg | 4 years ago
nofunsir | 4 years ago
allears | 4 years ago
gregfjohnson | 4 years ago
- The compiler won’t have bugs.
- The compiler won’t have show-stopping bugs.
- The compiler’s bugs will be fixed quickly.
- The compiler’s bugs can be fixed.
But I think they should have added:
- My code does not work because of bugs in the compiler.
remram | 4 years ago
> Because the bugs are in their code, not mine.
biasedestimate | 4 years ago
allendoerfer | 4 years ago
Bookmarking poorly organized lists will make you avoid all their pitfalls.
Also:
Lists from the Internet are correct.
y42 | 4 years ago
drivers99 | 4 years ago
(Following my own advice...) For example: https://chiselapp.com/user/ttmrichter/repository/gng/doc/tru...
> 10× programmers exist.
(i.e. saying they don't exist) Maybe a writing prompt for a blog post, but there's nothing else there about it.
jqpabc123 | 4 years ago
daphneokeefe | 4 years ago
Hussell | 4 years ago
Partial list:
Falsehoods about Airline Seat Maps https://duffel.com/blog/falsehoods-about-seat-maps
Falsehoods about Biometrics https://shkspr.mobi/blog/2021/01/falsehoods-programmers-beli...
Falsehoods about Plain Text https://jeremyhussell.blogspot.com/2017/11/falsehoods-progra...
Hussell | 4 years ago
Falsehoods about Names https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...
Falsehoods about Time https://infiniteundo.com/post/25326999628/falsehoods-program...
Falsehoods about Addresses https://www.mjt.me.uk/posts/falsehoods-programmers-believe-a...
glitchc | 4 years ago
Optimal_Persona | 4 years ago
kmod | 4 years ago
hintymad | 4 years ago
SuperCuber | 4 years ago