How many man years are wasted with western naming convensions?

14 points| drudru | 3 years ago |games.greggman.com

62 comments

[+] MauranKilom|3 years ago|reply
Maybe I'm being dense, but I fail to see the point. The code has the exact same sequence of tokens in all cases, just with different names for the constant. The fact that a Ctrl+F doesn't find all related uses seems to be the only argument and it doesn't convince me.

The article does not provide an alternative solution, or a discussion of the upsides of different naming conventions (e.g. knowing that ALL_CAPS is almost surely a compile-time constant). "How many man years are wasted..." - compared to what?

[+] shultays|3 years ago|reply
I think he is complaining about not being able to copy paste GL_MAX_TEXTURE_SIZE to limits.GL_MAX_TEXTURE_SIZE? Otherwise I am failing to see the point as well
[+] geoalchimista|3 years ago|reply
The title seems to imply that this problem does not exist in languages not using Latin scripts. That's not true. Take Japanese for an example, when you have hiragana, katakana, and kanji to express the same idea (not to mention that a kanji typically has more than one reading), the problem is gonna be orders of magnitude more complex.
[+] rhizome31|3 years ago|reply
> Take Japanese for an example, when you have hiragana, katakana, and kanji to express the same idea

I'm sure you know this, but for readers who aren't familiar with Japanese, it might be worth saying that hiragana, katakana and kanji aren't strictly interchangeable. Sure you can use hiragana and katakana as a fallback when you cannot or don't want to use the kanji for various reasons, but normally in every situation there's a recommended script to use in order to write correct Japanese.

[+] avian|3 years ago|reply
> Someone had to translate GL_MAX_TEXTURE_SIZE into mMaxTextureSize.

I fail to see how the fact that the Latin alphabet has upper and lower case is to blame for this. This is more about the coding standards the author has to abide by than anything else.

If your coding standard would require you to name the property MMAXTEXTURESIZE instead of mMaxTextureSize, would there be any less busy work?

[+] shultays|3 years ago|reply
even if his coding standard allows GL_MAX_TEXTURE_SIZE, I still don't see the point. One (original) GL_MAX_TEXTURE_SIZE is a global constant and his limits.GL_MAX_TEXTURE_SIZE is just a member. You will still need a "conversion" code for every constant there is.

    Notice all this busy work
And the code that comes after that is the same amount of work, except perhaps typing mMaxTextureSize instead of copy-pasting GL_MAX_TEXTURE_SIZE. Is he really complaining about this?

Also, is the author pushing for everyone to follow one standard? Hard to say what he is asking for.

[+] grose|3 years ago|reply
Japanese has two syllabaries, hiragana and katakana. They represent the same sets of sounds. I think this is very similar to uppercase and lowercase. Of course, the conventions around their usage are different.

If I write a sentence in ALL CAPS, I’M YELLING AT YOU. if i write a sentence in lowercase i am maybe laid back or aloof.

A sentence written in all hiragana might represent a toddler speaking. A sentence written in all katakana might represent a foreigner speaking broken Japanese (ouch!).

Usually, hiragana is used for grammatical purposes, such as particles and conjugations. But for some reason, laws and contracts use katakana instead. In a hypothetical Japanese programming language, which one should they pick? In university I had to learn this strange language called Dolittle[1] which uses Japanese but dodges the grammar particle issue by using spaces and symbols, which I found to be terribly confusing. I would love to see an attempt at a Japanese programming language that is closer to how the spoken language works.

This only scratches the surface. There are even more ways to represent arbitrary syllables, such as with arbitrary kanji (ateji) or even more complex systems like Kanbun. And of course there are many levels of simplification and variation within Chinese characters.

My point is that I don’t think this is inherently a Western thing, even if it ended up that way. It’s interesting to ponder what a non-Western language’s programming conventions could look like. Sometimes it’s even a real issue: see languages like Go where uppercase is semantically meaningful and therefore it is impossible to export Eastern symbols.

[1]: https://en.wikipedia.org/wiki/Dolittle_(programming_language...
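
The Go point above can be made concrete. The Go specification treats an identifier as exported when its first character is a Unicode uppercase letter (category Lu). The snippet below is only a Python approximation of that rule, not Go tooling; the function name `is_exported_go_style` is made up for illustration.

```python
import unicodedata


def is_exported_go_style(name: str) -> bool:
    """Approximate Go's export rule: the first character must be in
    Unicode general category Lu (uppercase letter)."""
    return bool(name) and unicodedata.category(name[0]) == "Lu"


# Kanji and kana have no case (their category is Lo), so a name that
# starts with one can never satisfy the rule.
for name in ["MaxTextureSize", "maxTextureSize", "人混み", "ヒトゴミ"]:
    print(name, is_exported_go_style(name))
```

Under this rule a name written entirely in kanji or kana would need some cased prefix to become public, which is the limitation the comment describes.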

[+] kryptiskt|3 years ago|reply
The different casing conventions are a signal indicating what kind of value the identifier represents (macro, local variable, type etc), they're actually doing work and aren't useless embellishment. If case didn't exist to signal those distinctions, other means would have been used (like the m_-prefix for members in some C++ styles). So the existence of upper case and lower case certainly isn't to blame, programmers would be free to use just one if it sufficed for them (and there was a time when a lot of programs consisted of upper case only).
[+] shakow|3 years ago|reply
If we programmed in Chinese, one would need to know thousands of characters to write any decent-sized program; if we programmed in Japanese, one would need to master several alphabets, if we were to program in Hindi, the very shape of our characters would depend of the context...

All in all, the upper/lower case problem does not seem to be that much of a waste of man-years.

[+] thaumasiotes|3 years ago|reply
> If we programmed in Chinese, one would need to know thousands of characters to write any decent-sized program; if we programmed in Japanese, one would need to master several alphabets

Right, because that's the difficult part. Memorizing 100 alphabets is significantly easier than memorizing the relevant Chinese characters. The syllabaries are zero percent of the difficulty of learning to read Japanese.

[+] Arnavion|3 years ago|reply
>if we were to program in Hindi, the very shape of our characters would depend of the context...

Eh? What context-dependent-shape-changing are you referring to? Hindi has a very normal alphabet. There are vowels, consonants, and a bunch of special ligatures for some combinations of those that are technically optional and could be done without. Nothing changes according to "context".

[+] oaiey|3 years ago|reply
Especially considering that English ASCII fits into 128 characters including punctuation ... I do not want to start thinking how Chinese or Japanese fits into that.

In reality the ask is maybe: let us use the 26 letters of English without uppercase? We have that in BASIC and dozens of old-style programming languages, though typically in upper case.

[+] lordnacho|3 years ago|reply
What exactly is the reason people started to have upper and lower case letters? I found a few articles about this appearing in the middle ages, but there doesn't seem to be a reason that makes sense. Seems to add nothing to the language, wtf is the point of the first letter being a capital anyway, and rules about various words that have to have caps?
[+] wanderingstan|3 years ago|reply
Like many things that evolved slowly over time, there isn’t a single clear reason for the upper/lower case distinction.

The distinction began as a mixing of different writing styles (called “hands” in calligraphy), something similar to how we today will mix fonts in a document.

“Lower case” was the newer, more common font. (What we’d today call uncial or Carolingian)

“Upper case” was a font based on older, Roman-era designs. This is why Roman monuments like the Trajan column appear to us as being in all capitals.

Using the older “font” at the beginning of sentences seems to have begun as a stylistic choice, but perhaps with some purpose as a reading aid to identify sentence starts, somewhat in the same way we’ll use different fonts for headings. Note that punctuation (periods, question marks, commas, etc) was also evolving at around the same time and didn’t exist in Roman times.

Bold and italic also evolved from the mixing of different “fonts” within one text.

Wish I could find a better source, but this is what I’ve learned as a calligrapher over the years.

Some more info here: https://www.babbel.com/en/magazine/history-of-capital-letter...

Edit: the codification of upper/lower case took place in the Italian classical revival during the Renaissance. See: https://www.primidi.com/history_of_western_typography/classi...

[+] tgv|3 years ago|reply
As the other comment says: readability. ALGOL-68 (infamously) allows spaces in identifiers, which makes the code look a bit more natural. C and its descendants don't, so a logical recourse is using underscores or, if that's too much typing, switching capitalization. Pascal was at the forefront of this convention (weirdly enough because the language is case insensitive, I think).

Other conventions were added: e.g., a capital for a type name, m as a prefix for class members, and of course a leading _ as an indication of not being public (which comes from the Unix linker, IIRC).

Anyway, I think the answer to OP's question is a negative number.

[+] myrion|3 years ago|reply
Legibility. It is a significant improvement, and it still bothers me that English underuses it, even after years of it being my primary reading language.
[+] Hamuko|3 years ago|reply
I'd dread to think of what naming conventions would look like if they were modeled after Japanese. You can write the same exact word as ひとごみ, 人ごみ, 人込み, 人混み, 人込, or even ヒトゴミ if you're feeling bold.
[+] lmm|3 years ago|reply
On the other hand, maybe we'd be forced to actually confront those equivalences and make languages that accommodate them.

Research has shown that the lowest hanging fruit in programming language design would be to make names case-insensitive - allowing you to have two variables that differ only in case causes far more problems than it solves. But, programming being a pop culture, no-one dares.

[+] mjevans|3 years ago|reply
Reminds me of Unicode normalization formats, only there's more than 4 of them.
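
For reference, Unicode defines four normalization forms (NFC, NFD, NFKC, NFKD). A small Python sketch of how they fold different spellings of the "same" text together; the helper name `same_after_normalizing` is made up for illustration, and the half-width katakana string is just one possible variant of the word from the comment above.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def same_after_normalizing(a: str, b: str, form: str = "NFKC") -> bool:
    """Return True if two strings are identical once both are normalized."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# "é" as one code point vs. "e" followed by a combining acute accent.
composed, decomposed = "\u00e9", "e\u0301"
print(composed == decomposed)  # the raw strings differ
for form in FORMS:
    print(form, same_after_normalizing(composed, decomposed, form))

# Half-width vs. full-width katakana differ only by compatibility mapping,
# so only the K forms treat them as equal.
halfwidth, fullwidth = "ﾋﾄｺﾞﾐ", "ヒトゴミ"
print(same_after_normalizing(halfwidth, fullwidth, "NFC"))
print(same_after_normalizing(halfwidth, fullwidth, "NFKC"))
```

Normalization only covers encoding-level variants; it does nothing for ひとごみ vs 人混み, which are different spellings rather than different encodings.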
[+] doctor_eval|3 years ago|reply
Would that solve the problem of type/variable name conflicts in Go?
[+] dtech|3 years ago|reply
This article is a load of bollocks. If another language would be the main programming one, there would be other competing conventions resulting in similar problems.
[+] danbruc|3 years ago|reply
If this is really a major pain point in some project, there are probably solutions - macros, reflection, code generation, custom build steps and probably more. Or just change your naming convention.
[+] teknopaul|3 years ago|reply
You need to define the _translation_ between conventions in a way that does not introduce issues. Not just the one convention but how to get to the other and back: HASHTABLE to Hashtable, HASH_TABLE to HashTable.

Then codegen tools work properly.
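
A rough sketch of what such a translation could look like, assuming names are first split into words and then re-rendered in the target convention. All function names here (`split_words`, `to_pascal`, `to_camel`, `to_screaming`) are hypothetical, not from any particular codegen tool.

```python
import re


def split_words(name: str) -> list:
    """Split an identifier into lowercase words.

    Handles snake_case, SCREAMING_SNAKE_CASE, kebab-case, camelCase
    and PascalCase.
    """
    # Break between a lowercase letter or digit and an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Break between an acronym and a following capitalized word (GLMax).
    spaced = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", spaced)
    return [w.lower() for w in re.split(r"[\s_\-]+", spaced) if w]


def to_pascal(words: list) -> str:
    return "".join(w.capitalize() for w in words)


def to_camel(words: list) -> str:
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])


def to_screaming(words: list) -> str:
    return "_".join(w.upper() for w in words)


for name in ["HASH_TABLE", "HASHTABLE", "GL_MAX_TEXTURE_SIZE", "mMaxTextureSize"]:
    words = split_words(name)
    print(name, words, to_pascal(words), to_camel(words), to_screaming(words))
```

The round trip is only lossless when the word boundaries survive in the source name: HASHTABLE carries none, so it can only come back as Hashtable, never HashTable.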

[+] jlnthws|3 years ago|reply
That's why we have conventions: to avoid wasting time arguing about which style to use here and there. I wasted more time reading this article than I will ever waste with lower/upper case naming issues.

By the way, Arabic has 4 letter forms, Japanese has 3 sets of characters, Chinese has 2 character forms, Vietnamese and some other romanized Asian languages have many diacritics (for the tones) which change the meaning of words,... Latin alphabet as it is used in English is by far the simplest of all (to use, teach or learn).

[+] take_it_not|3 years ago|reply
a lot of swes in China use pinyin (which is a romanization system for mandarin Chinese) for naming variables/func/etc

basic vocabulary such as time/test/thread/doc/… might be used along with Pinyin as well, so you will see things like dingdan_time or testJieguo. As a Chinese speaker myself, I find this really hard to understand without a broader context, since Chinese is a tonal language.

[+] BlargMcLarg|3 years ago|reply
It's not a problem of the Western language, it's a problem of conventions made to preserve context and assuming the worst, along with some cult stuff.

You could avoid stuff like get/set, but that's pretty low-hanging fruit compared to the explosion of things when you have 3 components referencing from top to bottom, each with a 8+ character variable. Preserving context is expensive in writing and reading. Assuming the reader has a shared context is required to shorten many things, however. This can backfire and make things worse than writing things fully.

For example, you could shorten mMaxTextureSize to maxSize or maxTexture or something. It entirely depends on what the reader knows coming in and what other variables exist or may exist in the future. In the other example, "getCurrentContext()->getLimits()" could be shortened to "Context()->Limits()", assuming the only context available is the current one. In both examples, I have to make assumptions about what the reader knows and about both the existing and future code, which, as mentioned before, can backfire completely.

[+] kkfx|3 years ago|reply
Programming is for humans, that's why we have programming languages instead of writing machine code... Humans have different concept of writing. I think that's perfectly natural to have conventions on style of writing.

Koreans use syllabic bigrams, in the West we mostly use letters, with a handful of variations per country and two main alphabets (how many remember that in Europe we have various letters beyond Cyrillic and its national variations? Like þ, Ð, ȝ, ...), Japanese even has three alphabets (hiragana, katakana and kanji), often mixed as kanji reading aids, ... we have invented the concept of International Auxiliary Languages, with their own alphabets, the most well-known being Esperanto, but they never took off, because pushing your own language gives an advantage to whoever pushes successfully, and so, one war at a time, we see winners and losers...

Computer time does not count much in that game.

[+] beardyw|3 years ago|reply
I agree to some extent. I often trip myself up with filename and fileName (which is right?)

And then there is the hyphen in CSS for example background-color which in js is backgroundColor. (Colour spelt wrong to me in both cases!)

[+] eimrine|3 years ago|reply
This is so tabs vs spaces holy war. If you can type rather fast you can type any variable without autocomplete no matter how long. If I understood correctly, the man years are wasted only on extra typing.
[+] gjvc|3 years ago|reply
much less than by poor spelling
[+] shaky-carrousel|3 years ago|reply
Or by having to go and check if something is a constant, a variable or a class. Casing provides metadata about an element.
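
As a toy illustration of casing as metadata, the sketch below guesses a convention from the shape of a name. The patterns and the `classify` helper are assumptions made up for this example; what each convention means (constant, type, member) still depends on the codebase's style guide.

```python
import re

# Order matters: the first pattern that matches the whole name wins.
CONVENTIONS = [
    ("SCREAMING_SNAKE_CASE", re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*")),
    ("PascalCase", re.compile(r"([A-Z][a-z0-9]+)+")),
    ("camelCase", re.compile(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)+")),
    ("snake_case", re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")),
]


def classify(name: str) -> str:
    """Return the first convention whose pattern matches the whole name."""
    for label, pattern in CONVENTIONS:
        if pattern.fullmatch(name):
            return label
    return "unknown"


for name in ["GL_MAX_TEXTURE_SIZE", "mMaxTextureSize", "Limits", "max_texture_size"]:
    print(name, "->", classify(name))
```

A single lowercase word such as limits is ambiguous between camelCase and snake_case; this sketch simply reports it as snake_case.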
[+] est31|3 years ago|reply
The way Rust solves this inconsistency issue is by having a bunch of builtin style lints, that 95% of Rust code follows. The key word here is builtin, as in, they existed since 1.0 and are enabled in the compiler by default. You still have to deal with different naming conventions, but they only exist when you interface with some existing component/API.
[+] doctor_eval|3 years ago|reply
I was really hoping this was going to turn into a rant about visibility in Go.

As much as I like Go, I really dislike the use of upper case to make things public. About the only thing I miss about Java is the convention of using capitalised words for types and lowercase for fields, methods and variables.

I must hit a name conflict in Go every second week.

[+] jansan|3 years ago|reply
There is also kebab case, which is used in CSS for example.

    number-input-wrapper { background-color: goldenrod; }
But to be honest, for me those conventions are one of the smallest problems when programming.

Oh, and whoever uses lower case snake case in any circumstance needs a slap on the wrist IMO.

[+] mcv|3 years ago|reply
Isn't the real problem here that people unfamiliar with the Latin alphabet will have to learn the Latin alphabet?

I suppose they could design programming languages using the Chinese alphabet (maybe they already have? I wouldn't know) that use different conventions[0].

[0] Convention is with a 't', by the way.

[+] DiogenesKynikos|3 years ago|reply
Chinese kids already learn the Latin alphabet, even before they learn how to write Chinese characters.

The Latin alphabet is used as a stepping stone towards learning how to write, and as an input method for computers and phones (you type the pronunciation of the character you want to write, and then select from a list of characters that are homophones).