Testing GCC

84 points | dmalcolm | 9 years ago | developers.redhat.com

39 comments

[+] boris|9 years ago|reply
Interesting to see complex pieces of software including features that are primarily used for testing. I guess this is a natural next step after having testing micro-formats[1].

One thing I don't like about GCC's testing framework is the number of files. If you need to test 10 related things (say, diagnostics for a particular language construct), then you will have 10 different test files. I would much prefer for them to be in a single file. This is one of the goals of Testscript[2].

[1] https://blog.nelhage.com/2016/12/how-i-test/

[2] https://build2.org/build2/doc/build2-testscript-manual.xhtml

[+] dmalcolm|9 years ago|reply
[author of the post]

Thanks for the links; a lot of interesting material in there.

FWIW, DejaGnu gives some flexibility over how granular the test files can be. For example, here's part of the test case for the -Wmisleading-indentation warning I added in gcc 6: https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/...

i.e. that's one 700 line test source file, expressing most of the test coverage for one feature.

I realize now that the post should have had an example of what such a test case looks like; see the "dg-warning" directives in that file.
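To make the directive style concrete, here is a hypothetical sketch of such a test (not copied from the real testsuite; the warning pattern and function name are illustrative). The harness compiles the file with -Wmisleading-indentation and checks that the expected diagnostic appears on each marked line:

```c
/* Hypothetical DejaGnu-style test sketch (illustrative, not from the
   actual GCC testsuite).  The dg-warning comment tells the harness
   which diagnostic to expect on that line.  */

int
fn_1 (int flag)
{
  int x = 4, y = 5;
  if (flag)      /* { dg-warning "this .if. clause does not guard" } */
    x = 3;
    y = 2;       /* indented as if guarded by the 'if', but it is not */
  return x * y;
}
```

The misleadingly indented assignment still executes unconditionally, which is exactly the bug class the warning targets.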

So that's one "single file" test case. That said, I had to put some of the other test cases for that feature into different test files: some of them interact with the preprocessor in "interesting" ways, so it was simplest to split them out.

However, when a test starts failing, it's handy if the failing test is small (especially when stepping through it in the debugger...).

So in our testing we have a mix of both big test files covering a lot of material, and small test files, covering very specific corner cases.

Another aspect of granularity is how long the testsuite takes to run: each test source file in the DejaGnu suite involves exec-ing the compiler, which adds a fixed amount of overhead. It parallelizes well, but it's still advantageous to stuff more testing into fewer files (with the obvious tradeoff w.r.t. debuggability if something starts failing). This was another reason for adding the unit-testing suite: this part is all done in-process, and so these ~30,000 tests run in less than a second.
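The in-process style can be sketched generically like this (a made-up miniature runner, not GCC's actual selftest API): thousands of assertions execute inside one process, so there is no per-test exec overhead at all.

```c
/* Generic in-process unit-test sketch (names are hypothetical; GCC's
   real "selftest" framework differs in detail).  */
#include <stdio.h>

static int num_checks;
static int num_failures;

static void
check (int cond, const char *desc)
{
  num_checks++;
  if (!cond)
    {
      num_failures++;
      fprintf (stderr, "FAIL: %s\n", desc);
    }
}

/* Run every check and return the number of failures; all of this
   happens in one process, with no fork/exec per test.  */
int
run_selftests (void)
{
  /* Stand-ins for the ~30,000 real assertions.  */
  for (int i = 0; i < 30000; i++)
    check ((i + 1) - 1 == i, "integer identity");
  return num_failures;
}
```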

[+] saurik|9 years ago|reply
> So, in GCC 7, we’ve extended the C frontend so that we can embed fragments of GIMPLE and RTL dumps as the bodies of functions.

This additionally sounds extremely interesting for reasons that have nothing to do with testing. (Though I bet most of the things I'd want to do with this will continue to require the expanded platform support for __attribute__((__naked__)) that has blocked a lot of my use cases for inline assembly, and which does not seem to be something that is "wanted" by GCC; though I should verify it isn't just due to no one providing a patch... it isn't as if there aren't other people asking for it.)
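For readers who haven't seen the feature, a rough sketch of what the embedding looks like (written from memory of the GCC 7 -fgimple front end; the exact syntax and pass names may differ, so treat this as pseudocode):

```
/* Compiled with -fgimple: the function body is a GIMPLE fragment
   rather than ordinary C.  An optional startwith ("pass-name")
   clause lets the compiler begin at a specific optimization pass. */
int __GIMPLE ()
foo (int a)
{
  int t;
  t = a * 2;
  return t;
}
```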

[+] namelezz|9 years ago|reply
> As a relative newcomer to the project, one of my “pain points” learning GCC’s internals was the custom garbage collector it uses to manage memory

It's interesting that GCC has a custom GC.

[+] dmalcolm|9 years ago|reply
[author of the post here]

Yes: memory management in the compiler is interesting. There's a complicated graph of pointer references. Most of the time the compiler is building something relatively small, so we don't need to bother cleaning up; we just exit without freeing it all (for speed). But when e.g. building with Link Time Optimization, we can use large amounts of RAM, so a garbage collection can be needed.

[+] greglindahl|9 years ago|reply
I'd be surprised if it didn't have something related to memory allocation and freeing. Open64 (previously the SGI compiler) has an "arena" memory allocator to improve memory locality and make freeing scratch memory easier. Looks like LLVM uses arena allocation, too.
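The arena idea is simple enough to sketch in a few lines (a minimal illustration; Open64's and LLVM's real allocators are considerably more sophisticated, e.g. they chain additional blocks on overflow):

```c
/* Minimal arena-allocator sketch.  Every allocation bumps a pointer
   inside one big block, and all scratch memory is "freed" at once by
   resetting that pointer: good locality, O(1) cleanup.  */
#include <stdlib.h>
#include <stddef.h>

typedef struct
{
  char *base;
  size_t used;
  size_t cap;
} arena;

static int
arena_init (arena *a, size_t cap)
{
  a->base = malloc (cap);
  a->used = 0;
  a->cap = cap;
  return a->base != NULL;
}

static void *
arena_alloc (arena *a, size_t n)
{
  n = (n + 15) & ~(size_t) 15;   /* keep 16-byte alignment */
  if (a->used + n > a->cap)
    return NULL;                 /* a real arena would chain a new block */
  void *p = a->base + a->used;
  a->used += n;
  return p;
}

static void
arena_reset (arena *a)
{
  a->used = 0;                   /* frees every allocation in O(1) */
}
```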
[+] raverbashing|9 years ago|reply
Well, besides the big test of compiling an entire Linux distribution, yes, it is good to have specific tests for particular issues in the compiler (manageable pass/fail tests that can identify a problem quickly).
[+] greglindahl|9 years ago|reply
gcc has always had a huge unit test suite. Not a new thing.
[+] hermitdev|9 years ago|reply
Yeah... I got hit by a regression in a bug fix in RH's GCC back in the 4.1.2 era. The revision/build number only moved by a dozen or two, but they completely broke C++ anonymous namespaces at global scope (caused an ICE). Glad I didn't let Ops upgrade all of my boxes at the same time like they wanted...
[+] wolf550e|9 years ago|reply
So is GCC adding a toolchain similar to the one LLVM has around its bitcode?

In this instance, a way to write a text representation of an internal data structure that represents code, run a single optimization pass over that data structure, dump the result, and compare the result with a serialized text representation of the "expected value"?

Will they add the other parts, to extract an llvm-like core out of gcc and implement the C and C++ compilers and link time optimizer as users of that core?

[+] ori_b|9 years ago|reply
> So is GCC adding a toolchain similar to the one LLVM has around its bitcode?

Not as far as I can tell.

> In this instance, a way to write a text representation of an internal data structure that represents code, run a single optimization pass over that data structure, dump the result, compare the result with serialized text representation of "expected value"?

That has already existed for ages. See the -fdump-tree-* family of options. It also doesn't do anything LLVM-like.

[+] drfuchs|9 years ago|reply
Great, but now how will you find out that a particular optimization pass hasn't been completely obviated by earlier passes, since you won't be looking for C code that actually triggers it?
[+] greglindahl|9 years ago|reply
I never got around to actually implementing the following in the PathScale compiler, but my idea was to add logging in all the places where optimizations do something, and then create lists of optimizations that I expect to have successfully fired in various functions or programs. This would be especially handy for checking that new optimizations didn't disable important things for various SPECcpu benchmarks.
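The logging idea can be sketched like this (purely hypothetical names; the PathScale compiler never shipped this): each optimization records its name when it fires, and a test then checks that every expected optimization appears in the log.

```c
/* Sketch of "did the optimizations I expect actually fire?" logging.
   All names and structures here are made up for illustration.  */
#include <string.h>

#define MAX_FIRED 64

static const char *fired[MAX_FIRED];
static int n_fired;

/* Called by each optimization pass when it transforms something.  */
static void
log_opt_fired (const char *pass_name)
{
  if (n_fired < MAX_FIRED)
    fired[n_fired++] = pass_name;
}

/* Return 1 iff every expected optimization is in the fired log.  */
static int
check_expected (const char **expected, int n_expected)
{
  for (int i = 0; i < n_expected; i++)
    {
      int found = 0;
      for (int j = 0; j < n_fired; j++)
        if (strcmp (expected[i], fired[j]) == 0)
          found = 1;
      if (!found)
        return 0;
    }
  return 1;
}
```

A benchmark regression test would then assert, for example, that "unroll" and "vectorize" both fired on a given SPECcpu hot loop.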
[+] correnos|9 years ago|reply
Well, you could embed a machine-checked proof that instruction sequences of the targeted form will sometimes survive previous optimization passes and therefore the given optimization will always be contributing something.

But that's hard, so every compiler I've used leaves it to human engineering instead.

[+] bonzini|9 years ago|reply
You can do both kinds of tests: check that a pass still optimizes some GIMPLE or RTL input the same way, and check that the C code is optimized the same way. In turn, the latter can either look at per-pass dumps, or at the final assembly code for one or more architectures.
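The per-pass-dump flavor can be sketched as a hypothetical scan-tree-dump test (the options, pattern, and function are illustrative, not from the actual testsuite): compile with a dump flag, then grep the named pass's dump for the expected transformation.

```c
/* Hypothetical dump-scanning test sketch.  The dg-options directive
   requests a dump of the "optimized" pass; dg-final then checks that
   the addition was canonicalized to a multiply in that dump.  */
/* { dg-options "-O2 -fdump-tree-optimized" } */

int
double_it (int x)
{
  return x + x;
}

/* { dg-final { scan-tree-dump "\\* 2" "optimized" } } */
```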
[+] pizlonator|9 years ago|reply
I'm super impressed by the size of their test suite. I wish I had a test suite like that!
[+] monochromatic|9 years ago|reply
Took forever to load, then gave me "Page Does Not Exist." Anybody else having issues?
[+] smhenderson|9 years ago|reply
I am getting the same. Tried about 15 minutes ago and then again just now. Still showing the Page Does Not Exist error.
[+] dmalcolm|9 years ago|reply
[author of the post here]

Thanks; I'm chasing it up at my end.

[+] pirocks|9 years ago|reply
I too am having issues.