What does code readability mean? (2018)

[+] CobrastanJorji|5 years ago|reply

> Should we strive to satisfy the Shakespeare for Dummies demographic?

A thousand times, yes!

There is code out there so elegant and pure that, after painstakingly pulling apart its pieces and analyzing each choice of word like a Robert Frost poem, we're left feeling like we've seen a face of God.

When debugging in the middle of the night because an alarm went off or a customer is angry, T. S. Eliot is absolutely the last thing I want to see. When I want to maintain a large system quickly, I want to be treated like I'm 5. I don't want Shakespeare. I don't even want Shakespeare for Dummies. I want "Dick and Jane Process a Customer Order."

I absolutely agree with the author that when I say "This code is unreadable," I actually mean "I haven’t spent enough time trying to read," but time is something I don't have. My coworkers and I will likely read the same code many times. We only have so much time to spend reading code. Code that takes a lot of time to read is bad. We do not have time to keep studying the code until it is no longer unreadable.

We are more than willing to invest quite a lot of additional time writing the code to make the time it takes to read the code as low as conceivably possible. Heck, with a small handful of exceptions, I will happily trade quite a bit of performance for simplicity.

Sure, if you're Terje Mathisen, and you're writing Quake 3, and Polyhymnia or Velvel Kahan descend from Mt. Helicon singing of inverse square roots and the number 0x5f3759df, and you happen to be trying to optimize a 3D game engine, sure, write poetry. Ages and ages hence, we shall read of your algorithm by the fireside and slowly smile as a blog explains to us why it is brilliant. Sing on! But if you're writing a normal program that's gonna be maintained frequently by normal people, you call security. That muse does not have a badge. She does not work here. There is a clear "no children of Mnemosyne" sign in the lobby.

[+] kpmah|5 years ago|reply

But, in the general case, who is this lowest common denominator programmer you are writing for? I can probably work out a few lines of terse Haskell more quickly than a page or two of boilerplate Java, because that's what I'm familiar with. Readable code is about pattern recognition, and everybody has a different set of patterns.

"Shakespeare for Dummies" works because English speakers tend to have a shared cultural familiarity that you can write for, but programmers typically don't.

That's not to say style isn't important. I'd say consistency is more important than any notion of simplicity, because once the reader has recognised a pattern they can re-recognise it quickly.

[+] matusp|5 years ago|reply

This. The problem with the Shakespeare argument is that the author compares code to artistic writing. But code should be compared to technical writing. When you write technical analysis or documentation, you are expected to create an easily understood document that is tailored for the reader. Even if you really enjoy Joyce, you usually do not do your technical writing in Finnegans Wake style. That would be a terrible practice.

It is similar with code. People are writing obscure code for various reasons (trying to break the language, code golfing, etc.), but when you write code for a project that is to be maintained in the future, you should use the "formal" style of coding. This also means using the best practices and patterns that other people are accustomed to so they can be more effective.

[+] trentnix|5 years ago|reply

Preach it. The less cognitive load required to dive in and make changes, the better.

The young version of me wanted every opportunity to show off, but the grizzled veteran I am today is in love with simplicity and clarity.

[+] bluGill|5 years ago|reply

Have you read Dick and Jane? I have: 150 pages to say "Sally is upset because her toy sailboat got left outside in the rain, and she wants Dick to go get it". When I'm looking at a 3am level bug I don't want to wade through that mess, and odds are that the more complex rendering I gave is the one I'd rather read. The complex rendering might even have used the word "storm" above and averted the entire 3am crisis.

Dick and Jane has a place for my 5 year old. But we expect 5 year olds to grow up to better.

[+] throwaway_pdp09|5 years ago|reply

Truly. Illegibility as a programmer sport is a horrible and costly thing to pick up after.

[+] z3t4|5 years ago|reply

Also you should not need to press "go to definition" 10 times in a row...

[+] null_object|5 years ago|reply

I’m sure I recall seeing a graph showing the development of a programmer over time, showing how their code starts out as simple but scarcely readable, gradually gets better but then descends for a period into excessive abstraction and unreadability as they learn new and exciting aspects of the language and can’t resist the temptation to show off this abstruse knowledge, before climbing back into simple elegance and understandability as their self-confidence and assurance in the language gradually subsumes the need to be ‘clever’ and gnostic in their coding.

I’m amused the author chose Shakespeare as an example of ‘difficult’ writing. Shakespeare is eminently readable at all levels: his skill was precisely that his compelling stories can be read for their exciting narrative by anyone, and equally can be mined for deeper meaning by those that are interested in unraveling his layers of metaphor and symbolism.

[+] goto11|5 years ago|reply

Shakespeare is difficult to read because many words have changed meaning over time. So unless you are very careful when reading, you will think you understand, but actually misunderstand a lot.

Some of the puns do not even make sense anymore because the pronunciation has changed. You basically have to be a specialized linguist to understand them all.

[+] aszen|5 years ago|reply

Code readability is just one of the goals we aim for while designing software, performance is another, stability is another, flexibility and ability to make changes is yet another.

While reading any code I tend to think about these tradeoffs and see if that hard to read code is actually hard for a reason or simply a mess because no one took care of it.

Writing dumb easy looking duplicate code is not optimal either, I don't see an issue with stable, fast and efficient code that's harder to understand yet works brilliantly. One can easily increase readability of such code by adding comments but one can't get performance, stability and flexibility by writing comments.

Optimising for dummies just because code is easier to debug isn't a good idea, often times I have to debug easy looking code only to find that it's not that easy in execution as it was in reading. Simple code often doesn't handle all the edge cases, abstraction less code is harder to maintain and change, easy code can't evolve and grow with business needs.

It's only when code is unreadable, buggy and slow that it starts to become a real mess.

Also a shout out to Rich Hickey's distinction between easy and simple, creating simple code requires a lot more effort and thought and is not easy to just read and modify by dummies.

[+] kungtotte|5 years ago|reply

I think a lot of the time people think they have performance issues when they in fact don't, likewise with stability and flexibility.

In most cases, writing the readable and straightforward version first and only moving to the less readable but more X version (for any given value of X) after it's evident that you need to is the optimal solution.

It's the programming equivalent to buying cheap tools first and only buying the expensive version once the cheap one breaks: If it breaks you used it enough to warrant the expensive and more durable one, and if it didn't break you didn't have to spend more money than necessary.

[+] m_mueller|5 years ago|reply

I’m missing a bit of prescriptiveness here. Here’s a formula I’ve arrived at after 15y of academic and professional programming - sharing here with the purpose of having it challenged by the reader:

* every abstraction has a cost - DO a quick cost/benefit analysis in your head, or even better in the code comments, when choosing for/against an abstraction.

* generally, start with KISS.

  - a process begins with a verb, and its design usually starts with being a function/procedure

  - pieces of data that belong together across multiple processes should start out as an immutable object or struct. In Python I’m almost never using direct descendants of object; rather, I inherit either from NamedTuple (for final classes) or I decorate with @dataclass(frozen=True) for non-final.

* using the above, you’ll start seeing violations of DRY. DO use the rule-of-threes. If there’s a second repetition you’re either writing, or already anticipate based on the JIRA backlog - factor out the repeating code using an appropriate abstraction.

I’ve found that code written with these rules tends to be simple yet clean (these two properties can often be opposite: I’ve seen simple code that is constantly repeating itself and thus becomes unmaintainable, and I’ve also seen clean code that is the opposite of simple because of an abundance of abstractions; think FizzBuzzEnterprise).
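A minimal Python sketch of the KISS starting point described above: data that belongs together lives in immutable records, and the process is a plain function. The names `LineItem`, `Order` and `order_total_cents` are hypothetical, chosen only for illustration; they are not from the comment.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import NamedTuple, Tuple


class LineItem(NamedTuple):
    """Immutable record built on NamedTuple (the 'final class' case)."""
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass(frozen=True)
class Order:
    """Immutable record built with a frozen dataclass (may be subclassed later)."""
    order_id: int
    items: Tuple[LineItem, ...]


def order_total_cents(order: Order) -> int:
    """A process begins with a verb, so it starts out as a plain function."""
    return sum(item.quantity * item.unit_price_cents for item in order.items)


order = Order(1, (LineItem("widget", 2, 250), LineItem("gadget", 1, 1000)))
print(order_total_cents(order))  # 1500

# Both record types reject mutation after construction.
try:
    order.order_id = 2
except FrozenInstanceError:
    print("Order is immutable")
```

Nothing here needs an abstraction yet; under the rule of three, a shared helper or base class would only be factored out once the same logic shows up again.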

[+] slx26|5 years ago|reply

Yes, abstractions make symbolic sense for humans, but in a technical context they are most often problematic, because intuition and technical correctness rarely align. Not like they are on opposite sides, but intuition almost always has a lot of technical holes, and when you are coding, something always ends up surfacing from those holes... and it's never pleasant. In this sense, focusing on the data and simple rules is most likely to bring us closer to the true technical problem that we are trying to solve. Abstractions and organization and good names are still vital because we do need an intuitive vision of the system as a whole, but the technical processes need to be strictly correct, not just intuitive.

[+] fsnowdin|5 years ago|reply

That FizzBuzzEnterprise thing is hilarious. I can't believe I didn't know about it until now.

[+] TeMPOraL|5 years ago|reply

> When we observe this tendency in other (non-programming) contexts we may interpret it as laziness or short attention span. When we react this way to code we blame the code and the original programmer.

Strongly agreed. Particularly about a lot of examples of the so-called "clever code".

Programming is a profession. You can't expect to forever coast on what little knowledge landed you your first job. You're supposed to learn and improve over time. That means learning to understand more complex code, architectures and programming paradigms.

I can't read Haskell code at all. But I don't claim it's "clever" or "unreadable". I realize the code is probably fine, it's me who needs to pick up a book and learn the language.

There's rampant anti-intellectualism and (the bad kind of) laziness in programmer circles, thinly veiled as concern for efficiency ("less clever code = easier to debug", or "helps juniors contribute", as if the job of a junior wasn't - in big part - to be learning to become a senior); I find it just being penny wise, pound foolish. Lambda expressions[0], regular expressions, pattern matching, Lisp macros - this is not clever code, these are tools that can greatly reduce complexity and improve readability. They just require spending a few hours or days with a book, every now and then.

--

[0] - Yes, really; few jobs back, just after we transitioned to Java 8, I was told by my boss to maybe refrain from using lambda expressions for the sake of "more junior" people. Right, because polluting code with anonymous classes is easier on juniors than (event) -> { few lines; } in some GUI event handler.

[+] azaza123|5 years ago|reply

The problem of the article is the same as the problem of the general discussion on how to program better: quality of arguments. Instead of conducting scientific studies on code readability, programmers keep on sharing their opinions which they support with anecdotes or citations of Shakespeare. There is so much for the community to learn about...

[+] aszen|5 years ago|reply

I think the whole point of the article was that code readability is not a scientific phenomenon you can measure by conducting studies.

How readable the code is depends on who's reading, what their background is, and what style they prefer for expressing logic.

[+] virgilp|5 years ago|reply

> “Good code is simple” doesn’t actually say anything.

This is why I love Rich Hickey's "simple made easy" talk, because with it you can make the distinction between "simple" and "easy".

"Good code is simple" absolutely says something and has an objective meaning, at least for some people.

[+] chrisma0|5 years ago|reply

There is a recent study (to be published at ESEM'20) on an empirical validation of "Cognitive Complexity" as a measure of source code understandability: https://arxiv.org/abs/2007.12520

The authors state in the conclusion: "The metric correlates with the time it takes a developer to understand source code, with a combination of time and correctness, and with subjective ratings of understandability."

[+] 29athrowaway|5 years ago|reply

This does not acknowledge that code styles can be compared in objective ways.

What is more readable?

    # Expression 1
    1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1

    # Expression 2
    10

One requires you to do 9 arithmetic operations in your head, perhaps more than once if you lost count; the other one doesn't.

You can compare code in terms of length, nesting, cyclomatic complexity, number of negations, number of inputs, etc. It's not just subjective bikeshedding.
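As a rough illustration of comparing code by countable properties, here is a small Python sketch that uses the standard `ast` module to count binary operations and negations in an expression. The helper names `count_binary_ops` and `count_negations` are hypothetical, and these two counts are only simple stand-ins for the metrics listed above, not a full complexity measure.

```python
import ast


def count_binary_ops(expression: str) -> int:
    """Count binary arithmetic operations (e.g. each '+') in an expression."""
    tree = ast.parse(expression, mode="eval")
    return sum(isinstance(node, ast.BinOp) for node in ast.walk(tree))


def count_negations(expression: str) -> int:
    """Count uses of 'not' in an expression."""
    tree = ast.parse(expression, mode="eval")
    return sum(
        isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not)
        for node in ast.walk(tree)
    )


expression_1 = " + ".join(["1"] * 10)  # "1 + 1 + ... + 1" with ten ones
expression_2 = "10"

print(count_binary_ops(expression_1))  # 9
print(count_binary_ops(expression_2))  # 0
print(count_negations("not (a and not b)"))  # 2
```

The numbers are objective, but as the replies point out, whether a lower count means "more readable" still depends on what the code is meant to communicate.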

[+] bluGill|5 years ago|reply

1 is simpler in most cases because I trust the computer to get the answer, but I need to know all the explicit parts. 2 hides where it all came from for an answer which might be wrong.

[+] goto11|5 years ago|reply

What is the code intended to communicate? It we don't know the intention we cant say which example most clearly communicates this intention.

[+] aceBacker|5 years ago|reply

I think the point is that we develop in a stream of consciousness. We figure out what we need, pull it in, do what we need to do as it pops into our thoughts. When we're done it's time to start writing. Take that stream of consciousness and make it more logical. Easier to follow. Refined.

That's how authors write books. No one writes a first draft and declares the book finished. Every creative effort requires iteration to improve.

[+] diegolo|5 years ago|reply

There is this quote from Clean Code by Uncle Bob that I simply love.

'Avoid mental mapping: In general programmers are pretty smart people. Smart people sometimes like to show off their smarts by demonstrating their mental juggling abilities. After all, if you can reliably remember that r is the lower-cased version of the url with the host and scheme removed, then you must clearly be very smart.'
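To make the quoted example concrete, here is a small Python sketch contrasting the name `r` with a descriptive one. It assumes "the url with the host and scheme removed" means the path plus query string; the function name `lowercased_path_and_query` and the sample URL are hypothetical.

```python
from urllib.parse import urlsplit


def lowercased_path_and_query(url: str) -> str:
    """Return the URL without scheme and host, lower-cased (path and query only)."""
    parts = urlsplit(url)
    remainder = parts.path
    if parts.query:
        remainder += "?" + parts.query
    return remainder.lower()


url = "https://Example.com/Docs/Index.html?Page=2"

# Mental mapping: the reader has to remember what "r" stands for.
r = lowercased_path_and_query(url)

# No mapping needed: the name says what the value is.
lowercased_path = lowercased_path_and_query(url)

print(lowercased_path)  # /docs/index.html?page=2
```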

[+] fsnowdin|5 years ago|reply

I just finished a re-factoring project recently. This article really made me re-think everything I did.

[+] nendroid|5 years ago|reply

>By analogy, plenty of people find reading Homer, Shakespeare, or Nabokov difficult and challenging, but we don’t say “Macbeth is unreadable.” We understand that the problem lies with the reader.

Why does the responsibility have to be solely on the reader? There's plenty of code out there that's unreadable because of the coder; how is this outside of the realm of possibility? Why is all the onus and bias on the reader?

For example, readable:

    measurementOfLeftBottomSideOfBox = 200;

versus unreadable (an acronym of Left Bottom Side):

    LBB = 200;

Just like English, programming languages rely on the talent of the writer and on the abilities of the reader at the same time.

The best code is code written by a talented programmer who can make the code readable to All people of All skill levels.

One thing people get confused about is readability and elegance. Obviously "measurementOfLeftBottomSideOfBox" is readable but not elegant. While "LBB" is certainly elegant but not readable. My philosophy is readability over elegance, but you will find many programmers are unaware of this dichotomy and have a strict subconscious aversion to writing something ugly like "measurementOfLeftBottomSideOfBox."

This aversion leads to more unreadability than necessary. It's some subconscious thing in our minds that makes us code this way but when you think about it.... there's no logical point in it at all. Aim to encode as much context as possible into your code because it's completely irrelevant how ugly the variable appears.

[+] valenterry|5 years ago|reply

> The best code is code written by a talented programmer who can make the code readable to All people of All skill levels.

Unfortunately this is not true. Code is like language. You can make it readable for everyone by using the most easy to comprehend and least ambiguous language. But doing so decreases efficiency for both "advanced" writers and readers. "Advanced" grammar and vocabulary exist for a reason: they allow one to express thoughts more succinctly and concisely, and can decrease the time to understand something in great detail and context by orders of magnitude. But this requires knowledge and shared context between writer and reader.

It's the same for using a very efficient compression algorithm vs. plaintext. You cannot have the advantages of both at the same time. Pick your poison.

(I assume you understand "pick your poison", a non-native reader might not. It's so nice and concise, isn't it? :)

[+] userbinator|5 years ago|reply

I can just as well argue that long rambling variable names are unreadable, because they obscure the macro-level structure of the code and the continuous repetition creates extra cognitive load ("is this really the same variable as the other one? It looks like the first three words are the same, but...") especially when one tries to keep track of several of these huge names and follow the dataflow.

Also, left and bottom together refers to a point, not a side; so I would be doubly perplexed upon encountering such a name.

[+] ZephyrBlu|5 years ago|reply

> My philosophy is readability over elegance, but you will find many programmers are unaware of this dichotomy and have a strict subconscious aversion to writing something ugly like "measurementOfLeftBottomSideOfBox."

I strongly believe that naming like "measurementOfLeftBottomSideOfBox" is not that helpful or readable.

A name like that implies that there are measurements for each side of this box, so following that naming scheme we would have at least:

    measurementOfLeftBottomSideOfBox
    measurementOfRightBottomSideOfBox
    measurementOfLeftTopSideOfBox
    measurementOfRightTopSideOfBox

Look at how much useless text we have here. "measurementOf" and "SideOfBox" add nothing but clutter to the naming, and writing out practically the same thing 4 times suggests we could abstract this into a data structure.

I know I'm being overly pedantic in this case, but I think the sentiment behind this type of naming commits a few sins:

1) It's overly verbose. More than 3 words is a warning sign to me.

2) It's specific rather than generic. For instance if I name a function "sortSheepByHoofSize", it implies the reader knows what hoofs are, cares about them and knows how to measure them. Whereas when naming it "sortSheep" or perhaps even just "sort", it's immediately understandable on a surface level to practically everyone.

3) Following on from 2), this type of naming lacks context. We should leverage the context of surrounding code and abstractions to make naming understandable, instead of trying to pack all the meaning into one name. Oftentimes there's repeated information in names that could be inferred from context instead.
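One possible shape for the data structure suggested above, sketched in Python. It assumes the four names refer to four positions on a box; `Corner`, `Box` and `measurement` are hypothetical names, not from the thread.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class Corner(Enum):
    """The four positions that the long variable names spelled out."""
    LEFT_BOTTOM = "left-bottom"
    RIGHT_BOTTOM = "right-bottom"
    LEFT_TOP = "left-top"
    RIGHT_TOP = "right-top"


@dataclass(frozen=True)
class Box:
    """Holds one measurement per position; the context lives in the type."""
    measurements: Mapping[Corner, float]

    def measurement(self, corner: Corner) -> float:
        return self.measurements[corner]


box = Box({corner: 200.0 for corner in Corner})
print(box.measurement(Corner.LEFT_BOTTOM))  # 200.0
```

The words "measurement" and "box" now appear once, in the type, instead of being repeated in every variable name.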

[+] psychoslave|5 years ago|reply

Well, LBB actually seems quite the total opposite of elegance, according to my own sense of elegance. Elegance is far more subjective. Take a panel of some equally experienced programmers, and ask them to assess some code on its readability and elegance. I would expect the latter to vary far more greatly.

That said, I would find the other example an ominous sign about the code architecture. I would be more at ease finding a line like:

    box.edge(3).span.set(200)

Or any syntactic variation of such a statement.

It's not the variable name per se, it's what it conveys about the abstractions used in the code base.

[+] alfonsodev|5 years ago|reply

But wouldn't 'measurementOfLeftBottomSideOfBox' be a smell of data being too flat?

    box.measures.sides['left-bottom']

vs

    measurementOfLeftBottomSideOfBox

I know your point is elegant vs. readable, which I agree with; I just want to note another dimension, which could be flat vs. well-structured data.

[+] bluGill|5 years ago|reply

Change English to Korean and all the variable names to the Korean translation and read your reply again. To someone who is fluent in Korean there is no problem, but to the rest of us English speakers (who don't know Korean) the long names with weird symbols are even less readable than lbb.

I have in my day job some code we got from Korea. I'm told by those who have spent the time to understand it that it is good code. But since it was written in Korean it is a level harder to figure out.
