In most languages, the top words are self/this and function/def. In Go, on the other hand, the most common word is "err", and "if err != nil" accounts for three of the top four words. I really wonder how large a share of Go code is just propagating errors.
One day, maybe half a decade from now, Rob Pike will wake up from a nightmare and suddenly realise they got this one wrong. Maybe it'll occur at the same time as the generics one.
There are some ways to make error handling much less annoying in Go. Since error is just an interface, it's pretty easy to make monadic constructs that can carry the error information and let you write pipelined code. IMHO it makes the code much cleaner and easier to read, but a lot of golang enthusiasts haven't really adopted the idea because it's quite different from the established idioms around error handling. Hard-core gophers love them some verbose and stupidly explicit code, but I'm hoping that as the language gets adoption in the wider community the voice of reason will win here.
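As a rough illustration of the kind of construct described above (not the commenter's actual code), a minimal sketch might look like this. The `Result` type, its `Then` method, and the `positive` and `double` steps are all hypothetical names invented for the example, and the value is fixed to `int` to keep it short.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a hypothetical wrapper that carries either a value or an error.
type Result struct {
	Val int
	Err error
}

// Then runs f only if no earlier step failed; otherwise it passes the
// existing error along untouched, so the pipeline short-circuits.
func (r Result) Then(f func(int) (int, error)) Result {
	if r.Err != nil {
		return r
	}
	v, err := f(r.Val)
	return Result{Val: v, Err: err}
}

// positive rejects values that are zero or negative.
func positive(n int) (int, error) {
	if n <= 0 {
		return 0, errors.New("value must be positive")
	}
	return n, nil
}

// double never fails; it only exists to give the pipeline a second step.
func double(n int) (int, error) { return n * 2, nil }

func main() {
	ok := Result{Val: 21}.Then(positive).Then(double)
	fmt.Println(ok.Val, ok.Err) // 42 <nil>

	bad := Result{Val: -1}.Then(positive).Then(double)
	fmt.Println(bad.Err) // value must be positive
}
```

The error is checked once inside `Then` rather than after every call, which is the trade-off being argued about: less repetition, but the control flow is less explicit than a chain of `if err != nil` blocks.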
Lots of uses of "the" in C++ code, massively skewed by giant projects that have a massive section of boilerplate copyright info in a comment at the top of every single file. I didn't realise until I looked at this, but although that's really common to see in C, C++, and Java, you barely ever see it in the languages that I spend time in (Rust, Haskell, C#, Python, various flavours of Lisp, JavaScript).
In C# the first word is "summary" which is massively influenced by the way Visual Studio formats comments. I wonder though if this is an indication that a lot of people write comments.
The slow adoption of modern C++ is very apparent. Some things are not even listed: forward, unique_ptr, shared_ptr, tuple, constexpr. nullptr is much lower than NULL and move is quite low. These features are now 6+ years old! ;)
Smart pointers really need a native way to be specified, similar to how we use & to declare references and * for pointers.
Filling your code with shared_ptr<Something> and the like is just not going to win the majority over, even if it's useful.
This might just be my Python upbringing, but... am I the only one to be troubled by Go's single-letter words? I've always found Go code very hard to read because it isn't self-descriptive at all.
It depends. If the meaning is self-evident in context, as in a method declared as (p *player) Attack, then p makes perfect sense (to me) rather than typing 'player' everywhere in the function, just like typing i instead of index.
In short, I don't think any particular rule applies. That said, my impression is that many variable names in Go code are two or three letters long, like err, buf, src, dst, ok, etc.
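To make the receiver-naming point concrete, here is a small sketch. Only the `(p *player) Attack` signature comes from the comment; the fields, the `monster` type, and the method body are hypothetical and invented for illustration.

```go
package main

import "fmt"

// player and monster are hypothetical types, made up for this example.
type player struct {
	name   string
	damage int
}

type monster struct {
	hp int
}

// Attack uses the one-letter receiver p. The declaration on this line
// says what p is, so the short name stays readable in a small method.
func (p *player) Attack(m *monster) {
	m.hp -= p.damage
}

func main() {
	p := &player{name: "hero", damage: 3}
	m := &monster{hp: 10}
	p.Attack(m)
	fmt.Println(m.hp) // 7
}
```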
Go is a language in the almost-forgotten algebraic tradition (loosely descending Fortran → Algol → BCPL → C → Go), in which people prefer E=mc² to MULTIPLY REST-MASS BY SPEED-OF-LIGHT-IN-A-VACUUM BY SPEED-OF-LIGHT-IN-A-VACUUM GIVING ENERGY.
When you have a strong static type system like Go's, you don't need descriptive names that much: even single-letter variable names are evident, because the declarations are right there, close to where the names are used. It takes a bit of getting used to if you have only programmed in dynamic or scripting languages.
I am quite surprised that "self" is so much more used than the next word in the list "if". I know many languages where "self" is not a keyword at all, but I cannot think of a single language where "if" is not a keyword.
EDIT: Oops, I missed that there is a language filter. Ignore my comment :)
Besides "return", C/C++ doesn't have a particular word that stands out, probably because you just write things. E.g., you don't put "function" in front of a function.
Cool stats about Java. "import" is the most frequently used word. It looks like everything already exists in Java, so you just import all the things, add some glue code, and you are done.
I do appreciate this, but word clouds really are a terrible visualization method for text data. In the Python example, I cannot tell at all whether the frequencies of self and None are similar or drastically different. The table of values is more informative and less likely to be misread.
One can draw very interesting conclusions based purely on this. Examples:
- Python developers do not follow Clean Code (à la Uncle Bob) as much as Ruby developers, because the if statement is more frequent than def and return.
- Ruby makes it possible to write in a much more functional style than Python. OR Ruby developers like to develop more in a functional style than Python developers.
- People don't really care about good variable names ("a" is a terrible variable name in scripting languages like JS and Python, yet it is still in the top 11)
- PHP developers might practice "return early" in functions (more return than function keywords) OR their functions just do too much :)
- Ruby makes it possible to write in a much more functional style than Python. OR Ruby developers like to develop more in a functional style than Python developers.
I instinctively think you might be right (at least with your second statement), but what are you using as your metric here?
The Rust compiler seems somewhat overrepresented in this data set. I see "ccx", "fcx", and "CrateContext", which are only used in the Rust compiler itself.
It is probably a sign of a good language that the words used most should be similar in frequency to their use in pseudo code. When words like "end" (Ruby), "self" (Python), "import" (Java), "err"/"error" (Go and Node) are over-represented, it's likely a sign that the language is introducing accidental complexity. By this metric Swift looks astonishingly sane.
Pretty cool! Some unexpected results, or at least not what I guessed. "summary" as the top for C#, "SELECT" all the way down at #43 for SQL, "err" as the top for Go (I'm sure that will spawn some pleasant discussion).
I think for SQL their sample includes just ".sql" files, which tend to contain schema definitions and data dumps, hence CREATE and INSERT. Most of it is not handwritten, either.
Noughmad|9 years ago
grabcocque|9 years ago
rebeccaskinner|9 years ago
s_kilk|9 years ago
flgr|9 years ago
I now typically deal with this via a Must() function that takes two values and returns the non-err value, or panics if there was an err.
That means you can call it like con := Must(db.Connect(...)).(db.Connection)
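A minimal sketch of such a helper, assuming the `interface{}` style implied by the type assertion in the call above. `Connection` and `connect` are hypothetical stand-ins for `db.Connection` and `db.Connect`, which are not shown in the thread.

```go
package main

import (
	"errors"
	"fmt"
)

// Must returns v when err is nil and panics with err otherwise.
func Must(v interface{}, err error) interface{} {
	if err != nil {
		panic(err)
	}
	return v
}

// Connection and connect are hypothetical placeholders for a real
// database package; they exist only to make the example runnable.
type Connection struct {
	DSN string
}

func connect(dsn string) (Connection, error) {
	if dsn == "" {
		return Connection{}, errors.New("empty DSN")
	}
	return Connection{DSN: dsn}, nil
}

func main() {
	// The two return values of connect are passed straight into Must;
	// the type assertion recovers the concrete type afterwards.
	con := Must(connect("localhost")).(Connection)
	fmt.Println(con.DSN) // localhost
}
```

Because a failure turns into a panic, this style fits setup code where an error is fatal anyway better than request-handling paths that need to recover.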
AsyncAwait|9 years ago
oelmekki|9 years ago
z3t4|9 years ago
fagnerbrack|9 years ago
nonsince|9 years ago
adrianN|9 years ago
konsnos|9 years ago
chriswarbo|9 years ago
Made even better by the use of the language's logo as the word cloud's shape; the Haskell logo is ">λ=" https://www.haskell.org/static/img/haskell-logo.svg
anvaka|9 years ago
Unfortunately the symbols will not show up, because I'm ignoring them: https://github.com/anvaka/common-words/blob/master/data-extr...
TuringTest|9 years ago
ddavis|9 years ago
mschuetz|9 years ago
kosma|9 years ago
gribbly|9 years ago
kps|9 years ago
the_duke|9 years ago
Walkman|9 years ago
dustinmoris|9 years ago
x1798DE|9 years ago
goatlover|9 years ago
RodericDay|9 years ago
ape4|9 years ago
macygray|9 years ago
gravypod|9 years ago
mschuetz|9 years ago
mpjme|9 years ago
WatchDog|9 years ago
chriswarbo|9 years ago
donatj|9 years ago
di4na|9 years ago
cpsempek|9 years ago
enitihas|9 years ago
virtualwhys|9 years ago
I wonder what percentage of `_` usage is value/type discarding in pattern matching and type signatures vs. function application.
Would be nice to somehow ditch `case`:
nrinaudo|9 years ago
Walkman|9 years ago
brak1|9 years ago
(click them and you can see examples of how each word is being used)
dagw|9 years ago
Insanity|9 years ago
anvaka|9 years ago
pcwalton|9 years ago
questerzen|9 years ago
realworldview|9 years ago
cven714|9 years ago
pepve|9 years ago