"Never abbreviate a variable" is a very strong statement, surely inspiring religious wars. And maybe there are edge cases: i as a loop counter, id for an integer primary key, whatever. But this example is something else entirely; "ttpfe" is honestly the worst variable name I have ever seen.
Historically, i wasn't even an abbreviation: it was the first letter in the range (I through N) that FORTRAN compilers assumed to be INTEGER, since they implicitly assigned types to variables based on the first letter of the name. The choice was probably further influenced by longstanding mathematical tradition, which uses i and j as indices.
(You could declare types and the compiler would respect it, leading to the old truism "GOD is REAL, unless declared INTEGER".)
(If you think that's the weirdest thing old FORTRAN did, look up the arithmetic IF statement sometime. Then, look up assigned GOTO.)
When hiring, we ask developers for sample code written specifically for the job application. We place a really high premium on readability. Occasionally we'll get code that is very concise: not just abbreviated variable names, but long, complex statements using ternary expressions and the like.
I'll usually ask them to resubmit code and really focus on readability and it turns out fine. But I think there might be a misconception among devs where the thought is that really compact code shows talent.
Our ideal coder is someone who cares deeply about performance and whose code is ridiculously easy to read and trace through. I mention performance because in some situations making things more verbose can affect it.
Whenever we interview a candidate, the one thing we place a high premium on is structure; the rest comes next.
Bad variable names and long, complex code can easily be adjusted. If the candidate doesn't have a good understanding of how to structure code (what goes where, for what purpose, separation of concerns...), it'll take much more effort to teach that candidate than to teach the bad variable namer.
Properly structured code naturally leads to better-performing software, and if/when a bug arises it'll be easier to spot and test.
All that being said, if a coder names variables 'a', 'aa', 'aaa', 'aaaa', that's a serious red flag.
The older I get, the less I value "cleverness" in programmers. I used to delight in writing code golf one-liners. But now I realize it detracts from solving the hard problems, and it makes simple problems harder.
In the 90's I worked on a FoxPro application, also for DoD. My boss, a retired Navy Captain, used very terse variable names, partly because he was a hunt-and-peck typist, and partly because earlier languages allowed only two-character names.
But FoxPro allowed variable names of any length. However, it only recognized the first ten characters. To my boss's chagrin, I often used names longer than ten characters, risking collisions with other long names. That never happened.
But ten years after leaving, I was hired to port the product to Visual FoxPro, which does recognize the whole name. Some of the early commits were "Fixed inconsistently-used long variable name..."
Of course, in those days we weren't using linters, or tests, or source control, or reproducible builds... and yet still had a business. No wonder Alan Kay calls computing "not quite a field."
I remember in college in the mid 80s trying to get something working in BASIC on one of the Apple IIs in the library, only to discover that while my earlier experience had been with a disk-based interpreter that supported 6-character names, the ROM-based interpreter only recognized the first 2 characters; it would accept longer names but not differentiate them.
Hilarity ensues...
And, yeah, I also did a lot of XBase in the late 80s, though usually "Clipper", rather than "Fox".
Batch files made a decent build process, and you could be disciplined with regular arc/zip files for source - not too unlike "make clean" and svn. We certainly had a lot of cruddy manual processes otherwise back then, though.
I was recently implementing a geometry algorithm which I looked up on Quora. It was described using typical vector notation, with variables like r, s, t, and u. Since I referenced the algorithm in the comments, I decided to use the same variable names in my code.
I think this is the right choice, but my code reviewer didn't. But he didn't click on the quora link.
Why is it okay for mathematicians to abbreviate things but not programmers? Is it because they deal in more abstract entities where the name is irrelevant?
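For context, that family of geometry algorithms is often written down with exactly those one-letter names. Here's a sketch of a common 2D segment-intersection formulation in that style; it's an illustration of the naming question, not necessarily the exact algorithm from the Quora link:

```python
# Segments p->p2 and q->q2, with directions r and s and parameters t and u,
# following the usual math-paper notation.

def cross(ax, ay, bx, by):
    """2D cross product (the z component of a x b)."""
    return ax * by - ay * bx

def segments_intersect(p, p2, q, q2):
    """Return the intersection point of segments p->p2 and q->q2, or None."""
    rx, ry = p2[0] - p[0], p2[1] - p[1]    # r: direction of the first segment
    sx, sy = q2[0] - q[0], q2[1] - q[1]    # s: direction of the second segment
    denom = cross(rx, ry, sx, sy)          # r x s
    if denom == 0:
        return None                        # parallel (collinear case omitted)
    qpx, qpy = q[0] - p[0], q[1] - p[1]    # q - p
    t = cross(qpx, qpy, sx, sy) / denom    # parameter along p->p2
    u = cross(qpx, qpy, rx, ry) / denom    # parameter along q->q2
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p[0] + t * rx, p[1] + t * ry)
    return None

print(segments_intersect((0, 0), (2, 2), (0, 2), (2, 0)))  # (1.0, 1.0)
```

With comments mapping each short name back to the source notation, the one-letter style stays traceable to the reference it was ported from.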
In math, notations are designed to make statements about the problem domain concise. Once you pass a certain degree of concision, longer names impede readability rather than enhancing it. That is because the ability to take in an entire complex expression or subexpression at a glance tells you things—and lets you see patterns—that wouldn't be as apparent if longer names were used. Programmers in the APL tradition understand this, but most programmers do not. (Many refuse to believe it's possible when they hear about it!)
In software, programmers have grown accustomed to a notion of readability that derives from large, complicated codebases where unless you have constantly repeated reminders of what is going on at the lowest levels (i.e. long descriptive names) there is no hope of understanding the program. In such a system, long descriptive names are the breadcrumbs without which you would be lost in the forest. But that is not true of all software; rather, it's an artifact of the irregularity and complexity of most large systems. It's far less true of concise programs that are regular and well-defined in their macro structure.
In the latter kind of system, there's a different tradeoff: macro-readability (the ability to take in complex expressions or subprograms at a glance) becomes possible, and it turns out to be more valuable than micro-readability (spelling out everything at the lowest levels with long names).
It also turns out that consistent naming conventions give you back most of what you lose by trading away micro-readability, and consistent naming conventions are possible in small, dense codebases. That of course is also how math is written: without consistent naming conventions and symmetries carefully chosen and enforced, mathematical writing would be less intelligible.
Edit: The fact that readability without descriptive names is widely thought to be impossible is probably because of how little progress we've made so far in developing good notations, and tools for developing good notations, in software. This may not be so hard to understand: it took many centuries to develop the standard mathematical notations and good ways of inventing new ones to suit new problems. Mathematics is the most advanced culture we have in this respect, and in computing we're arguably still just beginning to retrace those steps. If we wrote math the way we write software, mathematics as we know it wouldn't be possible.
Edit 2: The best thing on this is Whitehead's astonishingly sophisticated 1911 piece on the importance of good notation: http://introtologic.info/AboutLogicsite/whitehead%20Good%20N.... If you read it and translate what he's saying to programming, you can glimpse a form of software that would make what people today call "readable code" seem as primitive as mathematics before the advent of decimal numbers seems to us. The descriptive names that people today consider necessary for good code are examples of what Whitehead calls "operations of thought"—laborious mental operations that consume too much of our limited brainpower—which he contrasts to good notations that "relieve the brain of unnecessary work".
Applying Whitehead's argument to software suggests that we'll need to let go of descriptive names at the lowest levels in order to write more powerful programs than we can write today. But that doesn't mean writing software like we do now, only without descriptive names; it means developing better notations that let us do without them. Such a breakthrough will probably come from some weird margin, not from mainstream work in software, for the same reason that commerce done in Roman numerals didn't produce decimal numbers.
> "Is it because they deal in more abstract entities where the name is irrelevant?"
Partially, but also because math equations don't really have strong maintainability needs. Another mathematician isn't going to walk in 6 months later, get confused, and blow up the universe.
Though I'd argue that in fact math equations should be better named. They're rather abstractly named as a matter of convention, but they would be more easily understandable in many cases if variables were more carefully named.
Another risk of "porting" geometry algorithms, especially complex ones, directly from their mathematical expressions, is that you don't gain insight into why they work. This makes debugging later difficult, since nobody who wrote the code actually understood what the algorithm was doing. Forcing yourself to rename variables into something sensible will also force you to understand the mechanics of what's happening.
I find that as a piece of code skews more math- and geometry-centric, the variable names skew more towards math-like brevity as well. To some degree this is due to historical technical limitations (LAPACK and Fortran being one extreme: dgbsv, anyone?), but I see a ton of contemporary production code with one-letter variable names. R is a rotation, n is a normal, p is momentum, whatever. I think a lot of the historical notation conventions carry over to code, and most peeps working in the domain day to day are down with whatever baggage is brought along. This is fine until you try to read francophone code... try reading a French thesis and you'll find that they spurn all conventions the rest of the world has agreed upon :).
>Why is it okay for mathematicians to abbreviate things but not programmers? Is it because they deal in more abstract entities where the name is irrelevant?
for example, xyz^2 in a piece of written math means something different than it would in a program (in the math case we are obviously not taking the variable "xyz" and squaring it). I guess what I am trying to say with this example is that perhaps variable names in math are one symbol because concatenation, in many cases, already means "multiply".
If the variables had been named timeOfTenPercentHeightOnTheFallingEdge and timeOfTenPercentHeightOnTheRisingEdge it probably would have still been hard to notice that they had been swapped in that one line.
I think the catch is that the 'f' and 'r' keys are right next to each other, so if you accidentally type one instead of the other you'd get the other variable by mistake.
That said, you raise a fair point - we don't know how this bug got there. If it was a simple typo like above then the verbose names would have prevented it. If it was a logic error by the programmer (For whatever reason), then you're right that the name wouldn't matter because they typed the one they intended, it just wasn't the right one to use.
Yes, but the likelihood of mistyping them would be far less. 'f' and 'r' are right next to each other. 'Falling' and 'Rising' are a bit harder to unnoticeably fat-finger.
I definitely prefer more verbose names to abbreviated ones, but I'm not sure that never abbreviating a variable name is the right way to go either. Surely there's a middle ground between `Ttpfe` and `timeOfTenPercentHeightOnTheFallingEdge`?
I like using static types to avoid these sorts of problems. Modern languages like Swift, Rust and Haskell let you make zero-overhead type wrappers around other types.
So here they could have defined `newtype RisingEdge(Float)` and `newtype FallingEdge(Float)`, and then used those types in the function parameters as appropriate. Helps shorten function names to boot!
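The same idea can be sketched in Python with `typing.NewType`: a static checker like mypy flags swapped arguments, while at runtime the wrappers are plain floats with zero overhead. The `pulse_width` function here is a hypothetical example, not from the original code:

```python
from typing import NewType

# Distinct types for the two edge times, so they can't be swapped silently.
RisingEdge = NewType("RisingEdge", float)
FallingEdge = NewType("FallingEdge", float)

def pulse_width(rising: RisingEdge, falling: FallingEdge) -> float:
    # Passing the arguments in the wrong order is a type error under mypy.
    return falling - rising

print(pulse_width(RisingEdge(1.25), FallingEdge(3.75)))  # 2.5
```

Note that Python's `NewType` is checked only statically; a language with real zero-cost wrappers (Rust, Haskell, Swift) would also reject the swap at compile time.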
Use context. If you can't, refactor so you can. If you can't, well, you're SOL and stuck with timeOfTenPercentHeightOnTheFallingEdge.
Edit: e.g. if all of your variables are "timeOfTenPercentHeight...", cut that part out of the name. Full names can be just as bad as abbreviations - e.g. if they're all "timeOfTenPercentHeight..." then they all start to blend together.
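A minimal sketch of that trimming, using the thread's example names (the function itself is hypothetical): once the enclosing scope establishes the "ten percent height" context, the locals only need the distinguishing part.

```python
def pulse_width_at_ten_percent_height(rising_time, falling_time):
    # Not time_of_ten_percent_height_on_the_rising_edge etc.: the shared
    # prefix lives in the function name, so only rising/falling remain,
    # and the two names no longer blend together.
    return falling_time - rising_time

print(pulse_width_at_ten_percent_height(1.25, 3.75))  # 2.5
```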
One thing I often see elided in these naming wars is scope: am I the only person who gives longer names to globally visible things than to locals?
For example:
size_t find_foo(const char *source_text)
{
    const char *src = source_text;
    for (; *src != 'f'; src++) {
        /* ... do stuff ... */
    }
    /* ... do more stuff ... */
    return src - source_text;
}
Now the meaning of "source_text" would not be evident, except for the name. But just glancing at the usage shows that "src" is clearly a working cursor into the source text.
But if I called it "working_cursor" would that really explain anything to the reader? If anything, giving a detailed name risks misleading readers in much the same way as stale comments can mislead.
The problem was not that the variable was abbreviated. The problem was that the abbreviated variable was so similar to another abbreviated variable that was used for a similar purpose.
I would add - there's a bit of a difference between an abbreviation that just keeps something from being ungodly verbose (20 letters instead of 50), and an abbreviation that shortens something so much that the original meaning is completely lost ("Ttpfe"). This is especially true for things in context - "time of ten percent height on the falling edge" is needlessly verbose, but in context "ten_percent_falling_edge" would probably be perfectly fine. And indeed, that's really just an expanded version of "Ttpfe", which is what was used anyway.
If you shorten everything to five-letter names, it's not really that surprising that it becomes an issue. And I mean, what's the real point of abbreviations when they are so short that they make the code harder to read?
Edit: It's worth pointing out, though, that if this is really old C code it may be justifiable. Back in the olden days of C (older than C89, at least), only the first 8 characters of a symbol actually mattered: "blahblahone" and "blahblahtwo" would resolve to the same symbol. So shortening in this way could be necessary.
Yeah, the problem isn't that the airplane is on fire, the problem is that it crashed into another burning airplane as both tumbled uncontrolled through the sky, hurtling toward the earth.
I say this tongue in cheek: with current IDEs having such great autocompletion, has anyone experimented with coding far outside the ASCII character set? Programming-language restrictions aside, I recognize the obvious troubles this would cause. And yet I do a lot of work on simulations where math formulas are converted to code, which means lots of compounded Greek names like "omegaSquared" or "epsilonMinus". Naming decisions become more challenging as subscripts and superscripts are added, let alone matrix indices. At some point perhaps the symbolic name should be replaced with the descriptive name, such as "first_eccentricity_flat_to_fourth". But it sure would be nice to have access to something with such brevity.
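For what it's worth, Python 3 already accepts non-ASCII identifiers, so Greek letters from a formula can appear verbatim. A sketch, using a damped-frequency formula purely as an illustrative stand-in (not from the original simulations):

```python
import math

def ω_d(ω: float, ζ: float) -> float:
    """Damped natural frequency: ω_d = ω * sqrt(1 - ζ**2)."""
    return ω * math.sqrt(1 - ζ ** 2)

print(ω_d(10.0, 0.6))  # ~8.0
```

Whether teammates can comfortably type and grep for ω is, of course, the practical objection.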
I self-tested variable name lengths on my own code.
Three letters is enough to avoid most collisions. Words do not make sense yet.
At four letters most words become decipherable given an appropriate encoding.
At five letters a two word phrase may make sense.
I make a rough decision based on variable scope - shorter lifetime means shorter variable name, but I rarely go with just one letter as it reduces uniqueness.
If I need to use a really long phrase frequently I take a mathematician's approach and alias it to an abstract and highly unique symbol. The phrase may still exist in the addressed data structure, I just avoid it within the algorithm. Mathy code also has a tendency to encourage numbered variables, e.g. "x0, x1, x2".
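The aliasing approach above can be sketched like this; the geodesy names and constants are illustrative (WGS-84-like values), not taken from any particular codebase:

```python
import math

def meridian_radius(ellipsoid, lat_rad):
    # Alias the long descriptive phrases to short symbols for the algorithm;
    # the full names still live in the addressed data structure.
    a = ellipsoid["semi_major_axis"]
    e2 = ellipsoid["first_eccentricity_squared"]
    s = math.sin(lat_rad)
    # The dense formula stays readable with the short aliases:
    return a * (1 - e2) / (1 - e2 * s * s) ** 1.5

wgs84 = {"semi_major_axis": 6378137.0,
         "first_eccentricity_squared": 0.00669437999014}
print(meridian_radius(wgs84, 0.0))
```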
Heh. When I first learned to code as a kid, a variable was max two characters (the first of which had to be a letter A-Z and the other could be a letter or number).
I would suggest that the need for Englishy variable names is due to a weakness in programming languages and possibly the programming model itself. Why should a set of legitimate values for a computation benefit from how you refer to that set? Can that variable take on undesired values? Do you rely on that name and its comprehensibility to distinguish good from bad values? I sometimes find it hard to believe we still program this way.
We don't have to still program this way - you can write code with very strict types, with machine-checked proofs that it works correctly, etc, etc. We don't do this very often because it turns out this level of rigor is incredibly time-intensive.
While many of the Bell Labs guys didn't much like Pascal, Turbo Pascal managed to address almost all of the complaints I've ever seen, while preserving the good parts of Pascal (or Modula???).
Java must look irredeemable to the Bell Labs folks, though. Perhaps it is: UglyNames; limited structured constant literals; still too clunky lambdas for callbacks.
This is common in the sciences as well, since senior professors also had the 8-character limit (from Fortran [0]). Functions are also named this way (see LAPACK/BLAS).
Some people also hate typing out slightly longer variable names and whatnot. I try to emphasize that a section of code will tend to be read more times than it is written, and therefore readability is more important. It's a frustrating battle sometimes, though.
[0] Exacerbating the problem of using the wrong variable is the fact that much existing code uses implicit types...
TL;DR: Abbreviated variables are not always intuitive to others. I tend to agree. If you're going to use a pattern or abbreviation, be as non-creative as possible.
I think that his variable names section is utterly ridiculous, but it's a relevant read from a relatively prolific person, so worth sharing.
The same cognitive mistake could have happened with it spelled out. Two concepts closely related can easily be confused.