- Competitors to Coverity are CodeSonar[1] and Klocwork[2]. I've not seen Klocwork output, but CodeSonar and Coverity are in the same area of quality, with differing strengths. I cannot recommend static analysis highly enough if you have a C/C++/Java/C# codebase. It's very expensive (well into five figures, according to Carmack), but how expensive is a bug? What if you had your entire codebase checked daily for bugs? Consider the effect on your quality culture. :-)
- The fact that you are paying "well into five figures" for a tool that essentially covers up design deficiencies in your language should set alarm bells ringing in your head. The proposition more or less goes: "To have reliable C++ code in certain areas, you need a static analyzer; to gain that same advantage in Haskell costs you nothing more than GHC." Of course Haskell doesn't have certain C/C++ capabilities, but it's worth meditating on for your next application, particularly if bugs matter more than performance. N.B.: I don't know the ML family well enough to say one way or the other in this regard. :-)

[1] http://www.grammatech.com
[2] http://www.klocwork.com
The fact that you are paying "well into five figures" says more about the sales process than anything else.
Any time you have dedicated sales staff selling stuff to organizations with a sales process that needs to get buy-in from executives, your prices have to go way up to cover that cost. (Doubly so because the sales process is expensive for you, and your price point lowers your success rate.) Unfortunately many organizations have a rule that every purchase over $X has to be approved by management. Thus there are 3 price points that you can find proprietary software sold at:
1. Small enough that a developer can charge it to a corporate credit card.
2. Relatively modest recurring payments, individually small enough to pass on a corporate credit card.

3. Starting well into five figures.
"To have reliable C++ code in certain areas, you need a static analyzer; to gain that same advantage in Haskell costs you nothing more than GHC"
It is certainly true that many of the static errors that Coverity and friends can find aren't possible in Haskell because it won't compile, but I'd like to see a lot more public experience with Haskell and larger projects to see what the result would be for other problems. A lot of what we now know about how C++ should be written is the result of decades of hard experience, and who knows what we might find in Haskell with as much experience.
On the Java end (although these tools also handle C/C++ and other languages, maybe even .NET, their specialty as I understand it is Java) are Fortify [now owned by HP] and IBM AppScan Source [previously Ounce, from Ounce Labs].
Both produce high quality output in their domain. Fortify is widely used in the AppSec world.
ScottBurson | 13 years ago: (Full disclosure: I work for HP Fortify.)
And if you don't want to jump into Haskell's strangeness right away, there's always OCaml. It's fast enough for finance, so it should be fast enough for you.
(I prefer Haskell, it's much more beautiful. But that's more of a religious issue. OCaml is still way better than, say, C++.)
I wonder if there are languages that have the reliability of Haskell's type system but are suitable for C/C++ work (that probably means, at the very least, that they're strictly evaluated). ATS came up as a possible candidate; does anyone have real experience with it?
BTW, I once considered this (this being a language with Haskell's strengths, suitable for systems programming) a possible idea for a thesis, but at least for now it's put on hold. If anyone has any comments, I'd really love to hear them.
So I've dealt with dozens of Fortune-100 companies implementing and using static code analysis tools. They can and will help, but in general I feel these tools are not much more than the code equivalent of the syntax and grammar checker in your word processing software.
I've been doing manual code reviews for a living (mostly security related) for roughly 3 years now, and while I'm assisted from time to time by code analysis tools, I still find heaps of bugs not caught by any of the tools mentioned by Carmack. The biggest issue for a development shop is to integrate these tools properly and not overwhelm developers with too many false positives.
I've had cases where a developer got a 1500-page PDF spat out by one of these static analysis tools. After spending two weeks going through everything, the developer ended up with 50 pages of actual bugs; the rest described false positives. Then I got on-site and still logged dozens and dozens of security-related bugs that the static analysis tools had failed to find.
Edit: also consider that one needs a SAT solver even to do proper C-style preprocessor dependency checking. A lot of these code analysis tools are run on debug builds only; when the release build is made, the tools are not run again, which means they fail to catch a lot of issues. It's insanely hard to write proper code analysis tools, and I wouldn't trust at all any static source code analysis tool that does not integrate with the compilation process.
Nowadays, with clang, there are very nice possibilities for writing your own simple checks and integrating them into the build process. But even clang doesn't expose everything about the preprocessor that you might want from a static code analysis perspective.
A couple of years ago I ran across an article from Coverity about the challenges that they had getting people to use their tool. One of the most interesting was that the tool would find real bugs, but the developer wouldn't understand that the bug actually was real, and would say, "This tool is crap." Then they would not get a sale.
They had this problem particularly strongly with race conditions. There are a number of checks that they took out because developers kept convincing themselves that the bugs those checks found were false positives, even though they were real.
It really does not help that the developers who are asked to evaluate are likely to be the same people who made the mistakes in the first place, so all kinds of defensive behavior are to be expected.
> We had a period where one of the projects accidentally got the static analysis option turned off for a few months, and when I noticed and re-enabled it, there were piles of new errors that had been introduced in the interim. Similarly, programmers working just on the PC or PS3 would check in faulty code and not realize it until they got a “broken 360 build” email report. These were demonstrations that the normal development operations were continuously producing these classes of errors, and /analyze was effectively shielding us from a lot of them.
Something which corroborates this: when penetration testers break into systems, they're often using new 0-day exploits. Think about that. Most of today's software development practice produces such a steady stream of low-level bugs that penetration testers can assume they're there!
> Trying to retrofit a substantial codebase to be clean at maximum levels in PC-Lint is probably futile. I did some “green field” programming where I slavishly made every picky lint comment go away, but it is more of an adjustment than most experienced C/C++ programmers are going to want to make. I still need to spend some time trying to determine the right set of warnings to enable to let us get the most benefit from PC-Lint.
This could be encouraged using game dynamics. Have a mechanism where a programmer can mark parts of the codebase "green-field." A programmer's "green-field score" consists of the number of lines of green-field code (or statements, whichever lousy metric you want) that he's successfully compiled with no warnings whatsoever. Combine this with random sampling code walkthroughs, which has many benefits but will also catch boilerplate, auto-generated, or copy-paste programming by a "Wally" who's trying to "write himself a new minivan."
Having used PC-Lint almost all the way back to its origins, I can testify to just how scary it is to run it on your code: code you wrote as well as code written by teammates. In self-defense, you HAVE to spend time tuning the system in terms of warnings and errors; otherwise you drown in a sea of depressing information. I liked John's comment about attempting 'green field' coding. It is a tremendously valuable process, given the time. Great article, definite thumbs up.
I've been wanting something like this for Ruby for some time now. Since it's dynamically typed and ridiculously easy to monkey-patch, Ruby is a much harder challenge than C++. The two best efforts I have found are Diamondback Ruby (http://www.cs.umd.edu/projects/PL/druby/) and Laser (http://carboni.ca/projects/p/laser), but they mostly try to add static type-checking to Ruby code. After looking at these I implemented a contracts library for Ruby (https://github.com/egonSchiele/contracts.ruby) to get myself some better dynamic checking. The next step is to use the annotations for the contracts library to do better static code analysis. One thing I'm working on is generating tests automatically based on the contract annotations. But I've got a long way to go : ( If anyone knows about other projects that are working on static analysis for Ruby, I'd be very interested in hearing about them!
Looked at your Contracts library for Ruby. I was just thinking yesterday about starting a project like this! Some thoughts:
- I like that the contracts are right there in the code and not in a separate file.
- The actual Contract methods only run once upon loading each class, so they don't cause a performance hit. I'd like a way to make them all noops (so they don't add any instrumentation) when running in "production".
- My thought is that if you can define pre- and post-conditions as you're doing, then you can cut out a lot of the checks in your tests, so that as long as the tests exercise the code, you can rely on the post-conditions to do quite a bit of the validation. Plus you're validating outputs all over the place constantly, so you'd probably catch edge cases you never even expected.
- I'm disappointed that your system seems to offer little more than type checking (unless I'm mistaken). I'd like something that gives me true pre- and post-conditions: basically passing a block of Ruby code for each one, with common idioms abbreviated so I don't need to write a whole block if the conditions are simple.

Good luck on the project!
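Something along the lines of what's being asked for can be sketched in a few lines of plain Ruby. To be clear, this is a hypothetical hand-rolled wrapper, not the contracts.ruby API: the conditions are arbitrary lambdas evaluated with the receiver as self, so they can inspect object state, not just argument types.

```ruby
# Hand-rolled sketch of block-based pre/post-conditions (plain Ruby).
module Conditions
  def contract(name, pre:, post:)
    original = instance_method(name)
    define_method(name) do |*args|
      unless instance_exec(*args, &pre)
        raise ArgumentError, "precondition failed for ##{name}"
      end
      result = original.bind(self).call(*args)
      unless instance_exec(result, *args, &post)
        raise "postcondition failed for ##{name}"
      end
      result
    end
  end
end

class Account
  extend Conditions
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  def withdraw(amount)
    @balance -= amount
    amount
  end

  # True conditions rather than type checks: the amount must be positive
  # going in, and the balance must be non-negative coming out.
  contract :withdraw,
           pre:  ->(amount) { amount > 0 },
           post: ->(_result, _amount) { balance >= 0 }
end
```

Note the caveat: the postcondition fires after the mutation, so a real library would check before mutating or roll the state back on failure.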
The article mirrors my recent experience 100%. We've got a Coverity license and I've started using it recently. Luckily, our code base is relatively small: it's straight C and embedded (no mallocs, no OS). Even in this extremely simple environment, it's shocking how many errors Coverity can ferret out.
The false-positives are a problem and the general advice to get started is to initially ignore all existing bugs and focus on avoiding adding new bugs. Then, when you get the hang of writing code that passes the checks you go back and look for the worst of the older bugs, etc.
gaius | 13 years ago: Many a true word spoken in jest!
This actually makes a lot of sense. I mean, Microsoft is a gatekeeper on the Xbox 360 and enforces quite a bit of QA on every game officially released for the platform. If an Xbox 360 game crashes, I view it as a Microsoft failing, even if the game isn't developed by them. OTOH, if a non-Microsoft Windows app crashes, I don't blame Microsoft at all but rather the app developer.
Carmack is no doubt familiar with Microsoft's game cert process; it probably wasn't in jest at all.
It does impact Microsoft more, because Microsoft has taken on responsibility for games to be good (well, meet certain baselines for various definitions of good) through their TCRs.
estebank | 13 years ago: If you haven't read it, do so. You can read further discussion on this 270-day-old article at http://news.ycombinator.com/item?id=3388290
FindBugs [1] is a great code analysis tool for Java. It's free, open source, and supports plugins for writing your own checks. The FindBugs site reports an interesting story from a Google test day:
"Google held a global "fixit" day using UMD's FindBugs static analysis tool for finding coding mistakes in Java software. More than 700 engineers ran FindBugs from dozens of offices.
Engineers have already submitted changes that made more than 1,100 of the 3,800 issues go away. Engineers filed more than 1,700 bug reports, of which 600 have already been marked as fixed. Work continues on addressing the issues raised by the fixit, and on supporting the integration of FindBugs into the software development process at Google."

[1] http://findbugs.sourceforge.net/
For Java code, I am a big fan of SonarSource [1]. It is open source, is able to leverage checkers such as FindBugs, and integrates with code coverage tools such as Cobertura. I have found the clean dashboard to be a great boon, and have never felt intimidated by the reported warnings like I've been with Coverity et al.

[1] http://www.sonarsource.org
quaunaut | 13 years ago:

> If you aren’t deeply frightened about all the additional issues raised by concurrency, you aren’t thinking about it hard enough.

Why exactly is that?
The error conditions that cause bugs in concurrent code are difficult to replicate since concurrent code can have so many additional states of operation. It's also harder to program, especially in languages / environments that don't explicitly encourage it right from the beginning.
Concurrency is complex. You have to sync between threads/processes/actors, and the sequence of a function doesn't include everything happening to the data in use. That makes it harder to predict and debug the program.
jaimefjorge | 13 years ago: Qamine integrates directly with GitHub and is designed to be used by small and medium companies that cannot afford those expensive tools.

Darmani | 13 years ago: Your tool looks pretty cool and useful, but it seems quite a bit different from tools like Coverity and Astree, which reason about the behavior of programs under all possible inputs. Coverity can detect race conditions. Your tool looks like it's using technology closer to that of Refactoring Crawler, which was a quite impressive achievement, but not a static analysis tool.
The `Controler' part of my main codebase consists of interwoven PHP and MySQL. Is there a static analysis tool that understands both, one in relation to the other?
The closest thing I'm aware of is a tool by the University of Tartu which does this for Java, described in the paper "An Interactive Tool for Analyzing Embedded SQL Queries." It can find the SQL errors (misspelled "name" and missing space before "AND") in the following code:
public Statement ageFilter(int minimalAge) throws SQLException {
    Connection conn = getConnection();
    String sql = "SELECT name FROM people WHERE nmae IS NOT NULL"; // "nmae": the deliberate typo
    if (minimalAge > 0)
        sql += "AND (age >= " + minimalAge + ")"; // deliberately missing the leading space
    return conn.prepareStatement(sql);
}
There are a lot of really interesting challenges in doing this. Their tool uses a powerful technique called "abstract parsing" which can analyze all possible parse trees created by running the SQL parser on a string produced by Java string operations. It's pretty impressive what modern static analysis can do, and just how weak the linters we've gotten used to are in comparison.
Probably not, but you might want to give phpcpd, pdepend and pmd a try. They're all open source static analysis tools for PHP, and will find a number of potential issues.
I also find getting a good set of rules for phpcs is extremely helpful in PHP. You can use its sniffs to catch past mistakes and enforce consistent coding standards.
Good question. There aren't really any good tools that handle several languages interwoven in the same environment: not just PHP-MySQL but also PHP-HTML, HTML-JavaScript, or PHP-JavaScript. PhpStorm gives you intellisense for SQL strings embedded in your PHP code, but that's the most advanced cross-language feature I've seen, except for LINQ in C#.
Imagine a tool that could analyze this:
// SQL: table T with columns (int x, string y)

// PHP:
$data = sql_fetch_from_column(T);

// JavaScript:
var d = <?=json_encode($data)?>;
for (var a in d) {
    var b = d[a].x / d[a].y; // type error: int division by string not defined
}
For C/C++, also try just compiling with clang. It has great diagnostics. It also has the static analyzer, whose C++ support just improved greatly in trunk.
Funny timing, I just got jslint turned back on in our build today! (well, jsHint now due to the 'for(var i=0...' failing even with -vars enabled, but I digress...).
Another dev and I spent literally the entire day fixing issues - and we had jslint running on every checkin until a few months ago!
But, it was worth it. It feels great to know that those bugs won't happen again without a failing build :)
>due to the 'for(var i=0...' failing even with -vars enabled
That's because `for(var i=0...` is completely bogus. What if you have two of those loops? Are you going to put `var` in the first one, but not in the second one? Are you going to use 'i' in the first one and 'k' in the second one?
What if you need some refactoring and those two loops switch places?
It's fine if you like Java, but JavaScript is different. When in Rome, do as the Romans do. I also wish there were block scope, but merely pretending there is doesn't make it so.
It's better if you declare all of your variables at the very top of the innermost function, because that's where they effectively are (thanks to hoisting). It's better if your code reflects how it's actually working, because that will always make it easier to comprehend.
Once there is `let`, you should switch to Java's "declare on first use" rule and never use `var` again.
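The hoisting point is easy to demonstrate in plain JavaScript (runnable as-is under Node):

```javascript
// With `var`, the declaration is hoisted to the top of the function:
// the loop index exists before, during, and after the loop.
function varLoop() {
  for (var i = 0; i < 3; i++) {}
  return i;            // still in scope: returns 3
}

// With `let`, the binding lives only inside the `for` statement.
function letLoop() {
  for (let i = 0; i < 3; i++) {}
  return typeof i;     // no binding out here: "undefined"
}

console.log(varLoop()); // 3
console.log(letLoop()); // "undefined"
```

Which is exactly the argument for declaring everything at the top of the function until `let` is available: the `var` version's code layout lies about where the variable actually lives.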
stiff | 13 years ago: http://www.altdevblogaday.com/2012/04/26/functional-programm...
eru | 13 years ago: And then fire that person. (Or rather, see if big commits introduce more bugs than short ones, or whatever else you might find.)