I remember seeing a GitHub bot a couple weeks ago that strips out whitespace and adds a .gitignore file to a repo (I also remember this really rubbing some people the wrong way). This search indicates that it would probably be useful to have a linter bot running on GitHub for all the popular languages. It would find syntax errors, common misspellings, and compilation issues, and then submit pull requests to fix the issues.
I have no time to work on something like this myself, but I'm sure a lot of people would find it useful, especially if it acted as a "first defense" before deployment. Curious what other HN'ers think about this.
Perhaps it would be more productive to advocate the use of local pre-commit hooks. Git makes it very easy to configure validation locally long before anything gets sent to Github.
Would be nice if Github provided better documentation and a selection of validation templates to include in new projects. This would better leverage the power of Git and its distributed nature than a bot running on Github.
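To make the pre-commit idea a bit more concrete, here is a minimal sketch of the kind of check a local hook could run. It assumes a Node script called from .git/hooks/pre-commit; the word list and the findTypos name are made up for illustration, and reading the staged files is left out.

```javascript
// Hypothetical word list; extend it with whatever typos matter in your codebase.
const COMMON_TYPOS = {
  lenght: "length",
  heigth: "height",
  widht: "width",
  functino: "function",
};

// Return one entry per known misspelling found in `source`.
function findTypos(source) {
  const hits = [];
  const pattern = new RegExp(Object.keys(COMMON_TYPOS).join("|"), "g");
  source.split("\n").forEach((text, index) => {
    for (const match of text.matchAll(pattern)) {
      hits.push({
        line: index + 1, // 1-based line number for reporting
        found: match[0],
        suggestion: COMMON_TYPOS[match[0]],
      });
    }
  });
  return hits;
}

console.log(findTypos("if (items.lenght) {\n  draw(box.heigth);\n}"));
```

A hook built on this would exit with a non-zero status whenever findTypos returns anything, which makes Git abort the commit.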
I actually just replied with a suggestion that Github should implement some built-in functionality for simple error-checking. I think it's a great idea. It would be really helpful if you got a simple little list of notifications for a commit indicating possible error points.
Another problem I noticed is that the highlighter will select the language name as well as the term, which means <?php and (c) Copyright are shown instead of the actual mistake.
C# I can understand being so low, since it's almost always written in Visual Studio or MonoDevelop (both of which provide autocompletion). But how is JavaScript the next lowest?
Odd that compiled languages (C, C++, Java) are higher than some interpreted languages (PHP, JavaScript). Of course, the search will match comments as well as code, so it may just mean they have better comments.
For the record, Github's search index is wayyy out of date sometimes. The second user here is me and I deleted that user like two years ago: http://cl.ly/0y271f0T3G0X2J1L022E
Same here, I contacted support two times in two years and they said "we're working on it". Obviously, that isn't true and they just don't care about the outdated search index.
In JavaScript, if you check for a non-existent property on a variable (e.g. aVar.lenght vs aVar.length) it will return "undefined". So people often rely on this behaviour to check if something is an array or not (no comment on whether this is good or bad), with:
    if (somethingThatMightBeAnArray.length) {
        // do things with array
    }
So a misspelling of length may be making a lot of code out there behave in unexpected ways.
In a static language this would be flagged as an error. I assume something less than ideal happens in languages such as Ruby.
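A quick sketch of how quietly this fails. The variable names are made up, and isNonEmptyArray is only one possible stricter check, not something anyone in the thread proposed.

```javascript
const items = [1, 2, 3];

// Correct spelling: 3 is truthy, so the "array" branch is taken.
const correct = items.length ? "has items" : "skipped";

// Misspelled property: nothing is thrown, the lookup yields undefined,
// and the branch is skipped without any warning.
const misspelled = items.lenght ? "has items" : "skipped";

// A stricter check that does not rely on a property happening to be truthy.
function isNonEmptyArray(value) {
  return Array.isArray(value) && value.length > 0;
}

// prints: has items skipped undefined true
console.log(correct, misspelled, typeof items.lenght, isNonEmptyArray(items));
```

Note that a misspelling inside isNonEmptyArray itself would still go unreported, so this narrows the problem rather than removing it.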
I once worked at a company where a very early piece of code had a typo "properites" instead of "properties". This misspelling became institutionalized, and was used throughout the codebase because it was deemed too expensive to fix. And this was with a static language (with good IDE refactoring support)!
I had this problem as a junior dev when my English was weaker. The problem stems from the fact that 'height' is spelled with 'ht' but 'width' with 'th'. Since one often writes those words together, it is easy to mix the endings up. If you're a non-native speaker and don't run spellcheck on your code, you might end up writing 'lenght' and 'heigth' quite a few times. I know I did :)
My experience is more with languages that are typically compiled and would report this error as an error fairly early on, so the coder would correct it long before checking the code in.
What's the trade-off of having "undefined" returned instead of having an error reported as soon as the code is loaded?
Reporting an error at load time prevents you from later defining a 'lenght' method and using it at runtime without a recompile.
For core methods like 'length', it seems silly to think that you'd want to redefine it. And indeed, it's usually counterproductive - that's why any experienced JavaScript dev will have coding conventions like "Don't muck with the prototypes of built-in objects."
But at the application layer, this can be really useful. Imagine you're adding a new field to a message deep in the storage system, and then you want to pass that along to a template in the rendered HTML. It's really useful to be able to do this without recompiling & restarting each individual server between the backend and the frontend, and just edit a few template files and have them automatically pick up any changes to backend data formats.
Ditto adding a new database column, if you're using an RDBMS - it's pretty handy to have your model objects instantly reflect the new field, instead of needing to manually add accessors to each of your model classes. Rails and Django are built on this principle.
Also, you have a versioning problem with statically-compiled code in a distributed system. Imagine that you add this new 'lenght' field to a backend message, and add it to the frontend, and they both compile & deploy. Now imagine that a message from an old backend hits a new frontend (it's not possible to upgrade a whole distributed system at once without downtime). What does the new frontend do with it? It needs a piece of data, but the backend had no idea that it had to provide that piece of data. The only thing it can do is return the equivalent of 'undefined'.
In C++/Java code, you usually deal with these by inventing frameworks. Google code, for example, is littered with
    if (msg.has_new_field()) {
        run_long_complicated_ui_display_routine(msg.new_field());
    } else {
        fall_back_to_old_behavior(msg.old_field());
    }
checks. If you use a more dynamic language like Python, you can use language mechanisms to represent undefined values or fields that are defined at runtime. If you use a static language, you're stuck mimicking them with hashmaps and null.
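For comparison, a rough sketch of the same fallback in JavaScript, where a field the old backend never sent simply reads as undefined. The message shape and the names describe, newField and oldField are invented for illustration.

```javascript
// Hypothetical frontend routine: `msg` may come from an old or a new backend.
function describe(msg) {
  // An old backend omits newField entirely, so the lookup yields undefined.
  if (msg.newField !== undefined) {
    return "new UI: " + msg.newField;
  }
  // Fall back to the behaviour that only needs the old field.
  return "old UI: " + msg.oldField;
}

console.log(describe({ oldField: "a", newField: "b" })); // prints: new UI: b
console.log(describe({ oldField: "a" }));                // prints: old UI: a
```

The flip side is the one this thread started with: write msg.newFeild by mistake and every message quietly takes the old path.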
Whether your language is compiled is not the issue; it's how you model objects and method calls on them. In Smalltalk and other languages that take a message-passing approach, doing a.b() sends a message "b" to object a, and the object can do anything it likes with that.
Now the normal (and optimized) route is to find the method in a’s method table and then call that, but if a doesn't have that method then a second method may be called to allow this to be handled. Once you have that sort of mechanism, you can make ORM libraries that dynamically examine a schema at run time and generate accessor methods only as they are needed. Decorators, proxies and many other patterns become wonderfully simple, and there are often many more opportunities for meta-programming at run time.
The downside is of course that it becomes harder to find errors when writing or compiling, but tight integration of your development environment with your runtime can help with this.
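A small sketch of that fallback route, using JavaScript's Proxy as a stand-in for the "second method" described above (it is not how Smalltalk or any particular ORM does it). makeModel and the row contents are hypothetical.

```javascript
// Hypothetical helper: wraps a row so that getXxx() accessors exist for any
// column present in the row, without being declared ahead of time.
function makeModel(row) {
  return new Proxy(row, {
    get(target, prop, receiver) {
      // Normal route: the property really exists on the object.
      if (prop in target) {
        return Reflect.get(target, prop, receiver);
      }
      // Fallback route: synthesize an accessor from the requested name.
      if (typeof prop === "string" && prop.startsWith("get") && prop.length > 3) {
        const column = prop[3].toLowerCase() + prop.slice(4);
        if (column in target) {
          return () => target[column];
        }
      }
      // Unknown names still read as undefined, typos included.
      return undefined;
    },
  });
}

const user = makeModel({ name: "Ada", favouriteColour: "green" });

// prints: Ada green undefined
console.log(user.getName(), user.getFavouriteColour(), user.getLenght);
```

Adding a column to the row makes the matching accessor appear with no further code, and a misspelled accessor still slips through until something tries to call it.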
This reminds me of a US company I worked with that outsourced some of their service layer work to a company with heavy European influence. As a result, API methods also used the European spelling of certain words, e.g. getColour() or getFavourites(). Good times.
In the LaTeX editor that I'm using (WinEdt), I have a custom color highlighting that marks \rigth and \heigth in red+bold+strikeout, so I don't have to wait to compile and see a strange error to spot the mistake.
It'd be great if Github would scan your code for errors like these and just let you know they exist (in case you didn't want them to, which I would assume you wouldn't for the most part).
[+] [-] theli0nheart|14 years ago|reply
[+] [-] Mizza|14 years ago|reply
https://github.com/Miserlou/WhitespaceBot
Feel free to fork it to do whatever you want, that's why I made it.
[+] [-] tux1968|14 years ago|reply
[+] [-] andrewcamel|14 years ago|reply
[+] [-] zeratul|14 years ago|reply
[+] [-] jond3k|14 years ago|reply
I put together a GitHub Illiteracy Index script https://github.com/jond3k/sandbox/tree/master/github-illiter... which you can play around with if you like :D
[+] [-] koenigdavidmj|14 years ago|reply
[+] [-] cleaver|14 years ago|reply
Also fun to search on "functino".
[+] [-] rorrr|14 years ago|reply
[+] [-] josegonzalez|14 years ago|reply
[+] [-] eik3_de|14 years ago|reply
[+] [-] jond3k|14 years ago|reply
And you thought this would end up being a PHP joke...
https://github.com/jond3k/sandbox/tree/master/github-illiter...
[+] [-] xcud|14 years ago|reply
[+] [-] timdorr|14 years ago|reply
[+] [-] alpb|14 years ago|reply
[+] [-] angrycoder|14 years ago|reply
[+] [-] cpr|14 years ago|reply
[+] [-] southern|14 years ago|reply
[+] [-] amirhhz|14 years ago|reply
[+] [-] kaffeinecoma|14 years ago|reply
[+] [-] strictfp|14 years ago|reply
[+] [-] billpg|14 years ago|reply
[+] [-] nostrademons|14 years ago|reply
[+] [-] aardvark179|14 years ago|reply
[+] [-] joblessjunkie|14 years ago|reply
[+] [-] j_baker|14 years ago|reply
https://github.com/search?langOverride=&language=&q=...
[+] [-] gren|14 years ago|reply
[+] [-] veyron|14 years ago|reply
[+] [-] eik3_de|14 years ago|reply
[+] [-] flexd|14 years ago|reply
[+] [-] azth|14 years ago|reply
[+] [-] wahnfrieden|14 years ago|reply
[+] [-] davidmccann|14 years ago|reply
https://github.com/search?type=Code&language=JavaScript&...
[+] [-] mrchess|14 years ago|reply
[+] [-] gus_massa|14 years ago|reply
[+] [-] andrewcamel|14 years ago|reply