Popular Myths about C++, Part 3

[+] JoshTriplett|11 years ago|reply

> “To understand C++, you must first learn C”

> “C++ is an Object-Oriented Language”

> “For reliable software, you need Garbage Collection”

> “For efficiency, you must write low-level code”

> “C++ is for large, complicated, programs only”

Well, 2.5/5 of those aren't myths. You certainly don't need to write low-level code for efficiency, C++ does a rather poor job of acting like an OO language, and you don't need Garbage Collection, you just need to not manage memory manually, for which solutions other than GCs exist.
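
As a rough illustration of "not managing memory manually" without a garbage collector, here is a minimal RAII sketch. The `Tracked` type and the `live_objects` counter are made up for this example; they only exist to make object lifetimes observable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical counter, used only to observe object lifetimes in this sketch.
static int live_objects = 0;

struct Tracked {
    Tracked() { ++live_objects; }
    Tracked(const Tracked &) { ++live_objects; }
    ~Tracked() { --live_objects; }
};

void example_raii() {
    {
        std::unique_ptr<Tracked> one(new Tracked);  // freed when `one` leaves scope
        std::vector<Tracked> many(3);               // elements destroyed with the vector
        assert(live_objects == 4);
    }
    // No delete, no free, no garbage collector: leaving the scope released everything.
    assert(live_objects == 0);
}
```

The cleanup is deterministic and tied to scope, which is the property the comment is pointing at.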

You don't have to learn all of C to write C++, but unfortunately you have to learn C to understand other people's C++, because other people will not restrict themselves to the subset of C++ you consider respectable. That's true both for the C bits of C++ and for the obscenely complex corners of C++.

And C++ isn't just for large, complicated programs; it's for small, complicated programs too.

[+] nocman|11 years ago|reply

"And C++ isn't just for large, complicated programs; it's for small, complicated programs too."

hahahahahahaha!!!!!!!!!!!!!!!!!!! -- I was thinking that, but hadn't put it into those exact words yet.

I was thinking something like "C++ is definitely for complicated programs" -- not that they necessarily need to be complicated, but that C++ often unnecessarily complicates them.

I really wanted to like the STL a long time ago, but I gave up eventually. Happily my frustration with C++ led me to investigate other programming languages (not that it was the only one I had used), and I write precious little in it lately -- mostly just to modify code others have written in it.

[+] chisophugis|11 years ago|reply

What Bjarne doesn't mention is the enormous difference in code size between qsort and std::sort. The flexibility of having the compiler generate a sorting routine from std::sort is convenient but enormously redundant in many cases. In LLVM, we have array_pod_sort which is just a thin wrapper around qsort in order to avoid the code bloat of std::sort: http://llvm.org/docs/doxygen/html/namespacellvm.html#ae5788f...

For example, the following generates about 2KB of instructions (and will for basically every new type you want to sort):

    #include <algorithm>

    struct SomeStruct { int X; };

    void foo(SomeStruct *SS, int NSS) {
      std::sort(SS, SS + NSS, [](SomeStruct LHS, SomeStruct RHS) {
        return LHS.X > RHS.X;
      });
    }

A qsort equivalent will only emit code for the comparator which is just a handful of instructions.
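
For comparison, a qsort-based version of the same function might look roughly like the sketch below. The names `compareDescending` and `fooQsort` are assumptions for illustration, not LLVM's array_pod_sort; only the comparator is specific to `SomeStruct`, while the sorting loop lives once in the C library.

```cpp
#include <cassert>
#include <cstdlib>

struct SomeStruct { int X; };

// Comparator for descending order by X, matching the std::sort lambda.
// Written with comparisons rather than subtraction to avoid int overflow.
int compareDescending(const void *LHS, const void *RHS) {
    int L = static_cast<const SomeStruct *>(LHS)->X;
    int R = static_cast<const SomeStruct *>(RHS)->X;
    return (L < R) - (L > R);  // negative when LHS should come first
}

void fooQsort(SomeStruct *SS, int NSS) {
    std::qsort(SS, static_cast<std::size_t>(NSS), sizeof(SomeStruct),
               compareDescending);
}

void example_qsort() {
    SomeStruct data[] = {{1}, {5}, {3}};
    fooQsort(data, 3);
    assert(data[0].X == 5 && data[1].X == 3 && data[2].X == 1);
}
```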

C++ templates may be type safe and all, but at the end of the day they spew duplicated code just as much as those header-only macro-based C containers and algorithms; really more because it's less painful to write templates (vs. macros) and so you do it more, and there is more stuff in the templates. So even though in general the specialized generated code might be faster in most cases (as Bjarne likes to tout), the overall hit on your code size (and i-cache) can be dreadful. Currently, avoiding this issue in C++ just requires diligence on the part of the coder (some optimizations like LLVM's mergefunc can help, but in general it is a pretty hard problem and compilers are not Sufficiently Smart (TM) yet).

[+] btmorex|11 years ago|reply

As long as std::sort can fit in the instruction cache, who cares? (outside the embedded world obviously) It will always be faster unless you're getting regular instruction cache misses.

[+] gsg|11 years ago|reply

Yep. I remember seeing a presentation by one of the graphics engine shops that indicated they did the same thing. (I'll see if I can dig it up...)

Edit: it was DICE, see http://www.slideshare.net/DICEStudio/executable-bloat-how-it... - skip to page 14.

[+] humanrebar|11 years ago|reply

Your concern is valid, especially since it includes measurements on a real project. But I'm not sure it's a major concern. One can (and you already do) use qsort when binary sizes become a concern.

This is an interesting optimization, but it's not suitable for a beginner-to-intermediate C++ audience, which is who Stroustrup is addressing. The sensible default is to use std::sort.

If someone can (and it looks like you have) measure a benefit in doing something special, then she should have at it. If an entire project, again with measurements, can prove that std::sort shouldn't be its default sort algorithm, that's fine too.

[+] infogulch|11 years ago|reply

Does C++ not deduplicate templates that work on the same size types? E.g. if I used std::sort on arrays of `int`, `unsigned`, `float`, and half a dozen structs of exactly 4 bytes, I don't see why there would need to be more than one copy of the template in the final binary.
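
One reason a single copy cannot serve every 4-byte type is that the comparison itself depends on the type, not just on its size. The sketch below (the helper name `lessAs` is made up) compares the same two bit patterns as signed and as unsigned integers and gets opposite answers. Whether a toolchain merges instantiations that really do compile to identical code is a separate, toolchain-dependent question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compare the same two 4-byte bit patterns under a chosen element type.
template <class T>
bool lessAs(std::uint32_t lhsBits, std::uint32_t rhsBits) {
    static_assert(sizeof(T) == sizeof(std::uint32_t), "sketch assumes 4-byte types");
    T lhs, rhs;
    std::memcpy(&lhs, &lhsBits, sizeof(T));
    std::memcpy(&rhs, &rhsBits, sizeof(T));
    return lhs < rhs;
}

void example_same_size() {
    // 0xFFFFFFFF is -1 as a two's-complement int32 but the largest uint32.
    assert(lessAs<std::int32_t>(0xFFFFFFFFu, 1u));
    assert(!lessAs<std::uint32_t>(0xFFFFFFFFu, 1u));
}
```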

[+] fiatmoney|11 years ago|reply

"C++ is a big language. The size of its definition is very similar to those of C# and Java."

I can't speak with authority to C#, but C++ is a massively larger core language than Java with far more complicated semantics.

[+] magila|11 years ago|reply

This is one thing I wish C++ advocates would stop bringing up. While it's technically true that the page count of the C++ spec is comparable to that of C# and Java, not all pages are created equal. The C++ spec is incredibly dense: it's written in a very terse style and often packs as much information into a single sentence as other specs spend an entire paragraph on. Also, the Java spec in particular is typeset with much larger margins, a larger font, and generally more vertical white space than the C++ spec, making it seem relatively bigger than it really is.

[+] pjmlp|11 years ago|reply

I think the biggest problem is that many idioms in C++ can only be understood by those of us who embraced the language since the C++ARM days and understand the design decision to build the language on top of the C toolchain.

C++11 and C++17 might be quite a pleasure to use when a small team controls all the code, unfortunately most of the real world applications are done in pre-C++98 style.

And then, you still need to learn which version of the standard each library uses.

This is why there are so many fundamental types like string duplicated everywhere. We had to rely on third party libraries like Tools.h++ for consistency across compilers and OSs, before the majority reached C++98 compliance.

[+] comex|11 years ago|reply

Bjarne should know better than to directly compare the performance of std::sort and qsort. One is typically printed in full in a header file, while the other is typically compiled separately. If qsort were found in a header file, it could be inlined into identical code to the C++ version, regardless of the fact that there are void pointers lying around everywhere.

There are caveats: the compiler might not choose to inline unless you force it to, and if you do that then you'll end up with duplicate code in the case of multiple calls with the same comparison function, while C++ can automagically merge duplicates (although you probably want to write a wrapper function anyway, and C++ will still waste compile time generating the duplicates if the calls are in different source files). Also, if the sorting function calls a secondary function in multiple textual locations, and that function is significant enough that inlining it would produce wasteful code, the pure inlining-based approach will be insufficient (but I don't think most sorting algorithms do this).

In other words, C++ makes it easier to do this sort of thing. No surprise! It certainly makes it prettier. But when it comes to performance, in practice the above would likely not be a big deal for qsort, so the difference between the two functions is really more a matter of convention regarding the implementation location. Benchmarking the two and explaining only that type safety "makes for excellent inlining and good optimizations" is simply misleading.
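
To make the "qsort in a header" idea concrete, here is a hedged sketch of a qsort-style routine whose body is visible at the call site. `inlineSort`, `swapBytes`, and `compareInts` are hypothetical names, and insertion sort stands in for a real quicksort purely for brevity. Whether the optimizer actually inlines the comparator through the function pointer is up to the compiler; the point is only that nothing in the void-pointer interface prevents it.

```cpp
#include <cassert>
#include <cstddef>

// Swap two elements byte by byte; works for any trivially copyable element type.
static inline void swapBytes(unsigned char *a, unsigned char *b, std::size_t size) {
    for (std::size_t k = 0; k < size; ++k) {
        unsigned char t = a[k];
        a[k] = b[k];
        b[k] = t;
    }
}

// Hypothetical header-only, qsort-style routine (insertion sort for brevity).
static inline void inlineSort(void *base, std::size_t count, std::size_t size,
                              int (*compare)(const void *, const void *)) {
    unsigned char *bytes = static_cast<unsigned char *>(base);
    for (std::size_t i = 1; i < count; ++i) {
        // Move element i left until its left neighbour is not greater.
        for (std::size_t j = i;
             j > 0 && compare(bytes + (j - 1) * size, bytes + j * size) > 0; --j) {
            swapBytes(bytes + (j - 1) * size, bytes + j * size, size);
        }
    }
}

static inline int compareInts(const void *lhs, const void *rhs) {
    int l = *static_cast<const int *>(lhs);
    int r = *static_cast<const int *>(rhs);
    return (l > r) - (l < r);
}

void example_inline_sort() {
    int v[] = {3, 1, 2};
    inlineSort(v, 3, sizeof(int), compareInts);
    assert(v[0] == 1 && v[1] == 2 && v[2] == 3);
}
```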

[+] blub|11 years ago|reply

Are you sure that the callback can be inlined though? How would that work, the calls through the pointer would be replaced with the function body? I would assume that qsort itself could be, but that wouldn't help much.

[+] Aldo_MX|11 years ago|reply

Does compile time really matter that much? I mean, unless you are building a huge project which is expected to take hours to compile, I don't see much benefit in optimizing compile times.

[+] ridiculous_fish|11 years ago|reply

> I have never seen qsort beat sort

Well, here you go: https://gist.github.com/ridiculousfish/bb511993deba1d148317

    qsort: 674 ms
    std::sort: 1104 ms

qsort only requires one invocation of the comparator to determine the order, while std::sort often requires two. So qsort ought to be faster when comparisons are expensive.
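
The difference between the two comparator interfaces can be sketched as follows. The counter and function names are invented for this example, and it says nothing about how many calls a particular sort implementation actually makes; it only shows that recovering a full less/equal/greater answer from a boolean comparator can take two calls.

```cpp
#include <cassert>
#include <string>

static int comparatorCalls = 0;  // hypothetical counter for this sketch

// qsort-style: one call yields less (<0), equal (0), or greater (>0).
int threeWay(const std::string &a, const std::string &b) {
    ++comparatorCalls;
    return a.compare(b);
}

// std::sort-style: a single call only says whether a is ordered before b.
bool lessThan(const std::string &a, const std::string &b) {
    ++comparatorCalls;
    return a < b;
}

// Recovering the full three-way answer from lessThan can take two calls.
int threeWayFromLess(const std::string &a, const std::string &b) {
    if (lessThan(a, b)) return -1;
    if (lessThan(b, a)) return 1;
    return 0;
}

void example_comparator_calls() {
    comparatorCalls = 0;
    threeWay("pear", "apple");
    assert(comparatorCalls == 1);

    comparatorCalls = 0;
    threeWayFromLess("pear", "apple");  // first call is false, so a second is needed
    assert(comparatorCalls == 2);
}
```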

[+] gmfawcett|11 years ago|reply

Running your code, I get:

    qsort: 6112 ms
    std::sort: 4925 ms

Compiled on x86_64 with gcc 4.7.2 at -O3. I'm sure there are many possible reasons for the difference in performance.

[+] frankzinger|11 years ago|reply

With unmodified code and the exact same compilation command, I get:

  qsort: 5818 ms
  std::sort: 4948 ms

Ubuntu 12.04 x86-64; g++ and clang++ give the same results.

EDIT: with clang and libc++, same compilation line otherwise:

  qsort: 5822 ms
  std::sort: 571 ms

[+] alexkcd|11 years ago|reply

Neat! That said, std::sort is a template function, so you can pull the source (e.g., take the one from libc++) and change the comparator to return an int. You will still get all the benefits of inlining and optimizations from lack of type erasure, while performing only one comparison :)

Edit: Actually quicksort only needs a stable boolean comparator (e.g., < or >) to determine order. So the number of invocations to the comparator is the same for both qsort and std::sort. Source: http://en.wikipedia.org/wiki/Quicksort

[+] stinos|11 years ago|reply

Ran it about 10 times and using msvc the sort version is around 2.5 times faster. Maybe you just got lucky once? Or something went wrong with timing?

[+] yongjik|11 years ago|reply

I'm not sure why std::sort should require two comparisons. It's not required to be stable (neither is qsort), so when comparing a and b gives (a >= b), std::sort can just assume (a > b) and the array will be sorted just fine.

[+] pubby|11 years ago|reply

When I run a modified version: http://pastebin.com/raw.php?i=XJup4FsU

    qsort: 10466 ms
    std::sort: 5137 ms

[+] agwa|11 years ago|reply

Under what compiler? On GCC 4.9 they're neck-and-neck:

  qsort: 6727 ms
  std::sort: 6718 ms

(CPU is Core i7-950)

[+] detrino|11 years ago|reply

C++14 version that sorts random strings: https://gist.github.com/det/57c7f0e377e02ccc696f

[+] AlexeyBrin|11 years ago|reply

Let's try again, with a more C++14 like solution:

     sort(v.begin(),v.end(),[](const auto x, const auto y) { return *x > *y; });

instead of your line 39:

     sort(v.begin(),v.end(),[](const string *x, const string *y) { return *x > *y; });

Some results:

• g++ 4.9.2 with O3 qsort 545ms, sort 7289ms

• clang with O3 and libc++ qsort 551ms, sort 844ms

I've used:

   clang++ -std=c++1y -stdlib=libc++ -O3 test.cpp

and

    g++-4.9.2 -std=c++14 -O3 test.cpp

[+] kgabis|11 years ago|reply

That qsort example is getting tedious. I wonder why the restrict keyword is never mentioned when talking about C++'s performance advantages over C: it can offer a huge boost on modern CPU architectures. I know it's supported by all major compilers as an extension, but still, it's not part of the C++ standard.

[+] chuckcode|11 years ago|reply

Completely agree. The restrict keyword can free up the compiler to vectorize a lot of code with SSE and AVX on Intel chips.

[1] http://locklessinc.com/articles/vectorize/ [2] http://stackoverflow.com/questions/1965487/does-the-restrict...
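
A small sketch of what this looks like in C++ code, assuming the `__restrict` spelling that GCC, Clang, and MSVC accept as an extension (it is not standard C++). The function name `addArrays` is made up. Whether the loop is actually vectorized depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>

// `__restrict` promises the compiler that these pointers do not alias each other,
// which removes one obstacle to vectorizing the loop. Non-standard extension.
void addArrays(float *__restrict out, const float *__restrict a,
               const float *__restrict b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

void example_restrict() {
    float a[4] = {1, 2, 3, 4};
    float b[4] = {10, 20, 30, 40};
    float out[4] = {};
    addArrays(out, a, b, 4);
    assert(out[0] == 11 && out[3] == 44);
}
```

Passing overlapping buffers to such a function breaks the promise and is undefined behavior, so the annotation is a contract with the caller, not just a hint.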

[+] chuckcode|11 years ago|reply

One of the reasons people use "low level code" for performance is because the STL doesn't easily provide control of memory which is critical to performance. Electronic Arts wrote their own version of the STL largely so they could better control memory [1].

I'm not really sure about the rest of the myths. I'm a little confused about how “To understand C++, you must first learn C” is a myth since C++ is a superset of C so you kind of have to learn C.

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n227...

[+] ordinary|11 years ago|reply

While the C++ language is indeed a superset of the C language, the paradigms of modern C++ have almost no overlap with those of C. See for example the string qsort vs string std::sort elsewhere in this thread.

The myth in "To understand C++, you must first learn C" is not that C and C++ are unrelated. The myth is that learning C helps you understand C++. The reality is that telling people to learn C first is an excellent way of getting them to write really bad C++ when they switch over to C++11.

[+] darkpore|11 years ago|reply

I guess that depends on your definition of 'easily' - you can provide custom allocators fairly easily. What's more problematic are things like the memory usage patterns of std::vector and std::string. Once you know how they work you can avoid the pitfalls or use custom alternatives.
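
For readers who have not written one, a minimal C++11-style allocator is sketched below. `CountingAllocator` and `bytesAllocated` are hypothetical names; the allocator just forwards to `operator new` and records how much was requested, which is enough to observe a container's allocation pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

static std::size_t bytesAllocated = 0;  // hypothetical global tally for this sketch

template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U> &) noexcept {}

    T *allocate(std::size_t n) {
        bytesAllocated += n * sizeof(T);
        return static_cast<T *>(::operator new(n * sizeof(T)));
    }
    void deallocate(T *p, std::size_t) noexcept { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) noexcept {
    return false;
}

void example_allocator() {
    std::vector<int, CountingAllocator<int>> v;
    v.reserve(100);  // a single up-front allocation instead of repeated growth
    assert(bytesAllocated >= 100 * sizeof(int));
}
```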

[+] kbart|11 years ago|reply

Actually, I've found that knowing C is even undesirable, because programmers who come to C++ from C usually try to do everything the "C way", which leads to code that is "C with objects". To really learn modern C++, one must "forget" C and start with basic concepts.

[+] jevgeni|11 years ago|reply

Part 1 uses string concatenation as an example: C++ only requires "adding" two string objects, while C requires manipulating char pointers. And thus, the argument goes, C++ is the better programming language for teaching.

Although true, I feel this argument is rather weak: it's true, that when teaching I wouldn't want to start with pointers and malloc's from the get go, but it does not mean C++ is the only alternative.
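
The contrast being discussed looks roughly like this. The sketch is not the article's code; `joinCpp` and `joinC` are made-up names, and the C-style version is written the way one plausibly would with malloc and raw pointers.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// C++ style: concatenation is an expression, and the result owns its memory.
std::string joinCpp(const std::string &a, const std::string &b) {
    return a + b;
}

// C style: the caller must remember to free() the result.
char *joinC(const char *a, const char *b) {
    std::size_t lenA = std::strlen(a);
    std::size_t lenB = std::strlen(b);
    char *out = static_cast<char *>(std::malloc(lenA + lenB + 1));
    if (!out) return nullptr;
    std::memcpy(out, a, lenA);
    std::memcpy(out + lenA, b, lenB + 1);  // also copies the terminating '\0'
    return out;
}

void example_join() {
    assert(joinCpp("foo", "bar") == "foobar");

    char *joined = joinC("foo", "bar");
    assert(joined && std::strcmp(joined, "foobar") == 0);
    std::free(joined);
}
```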

[+] banachtarski|11 years ago|reply

Not as true any more. Are you familiar with value categories in C++11?

[+] stinos|11 years ago|reply

> I used a container version of sort() to avoid being explicit about the iterators

Is that something new in C++14? I couldn't immediately find it on the net. Or is it just a version he wrote himself? The latter makes sense for pretty much all algorithms in <algorithm> which you'd use often on a container, to the point you'd start wondering why the standard doesn't provide them built-in.

[+] jeorgun|11 years ago|reply

The C++17 standard should have them, once concepts become a thing; iirc they're not in it now to avoid issues with choosing between sort(Container, Comparator) and sort(Iterator, Iterator) overloads.
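
A hand-written container version is only a few lines. The sketch below uses a hypothetical name, `sortContainer`, rather than overloading `sort` itself, which sidesteps the question of choosing between sort(Container, Comparator) and sort(Iterator, Iterator).

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helpers, not part of the standard library.
template <class Container, class Compare>
void sortContainer(Container &c, Compare compare) {
    std::sort(std::begin(c), std::end(c), compare);
}

template <class Container>
void sortContainer(Container &c) {
    std::sort(std::begin(c), std::end(c));
}

void example_sort_container() {
    std::vector<int> v = {3, 1, 2};
    sortContainer(v);                        // ascending
    assert(v[0] == 1 && v[2] == 3);
    sortContainer(v, std::greater<int>());   // descending
    assert(v[0] == 3 && v[2] == 1);
}
```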

[+] pavanky|11 years ago|reply

The C greater function is unnecessarily long

  int greater(const void* p, const void* q)
  {
     return *(double *)p - *(double *)q;
  }

would work just as well.

EDIT: DON'T USE THIS, WON'T ALWAYS WORK

[+] kbwt|11 years ago|reply

That is technically undefined behavior, because the result of the double subtraction could very well be a number that is not representable within the range of a signed int.

[+] _RPM|11 years ago|reply

What is the point of defining the function with two void* arguments if the cast will just take place anyway?

[+] alayne|11 years ago|reply

Don't do this. It has overflow issues.
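
A comparator that avoids the problems with the subtraction version might look like the sketch below (ascending order; the name `compareDoubles` is an assumption). Besides the out-of-range conversion, subtraction also truncates small differences such as 0.25 to 0, which reports unequal values as equal. NaN inputs are not handled here.

```cpp
#include <cassert>
#include <cstdlib>

// Three-way comparison of doubles for qsort, ascending order.
// Each comparison yields 0 or 1, so the result is always -1, 0, or 1.
int compareDoubles(const void *p, const void *q) {
    double a = *static_cast<const double *>(p);
    double b = *static_cast<const double *>(q);
    return (a > b) - (a < b);
}

void example_compare_doubles() {
    double lo = 0.25, hi = 0.5;
    assert(static_cast<int>(hi - lo) == 0);  // subtraction would report "equal"
    assert(compareDoubles(&hi, &lo) == 1);
    assert(compareDoubles(&lo, &hi) == -1);

    double v[] = {0.5, 1e300, -1e300, 0.25};
    std::qsort(v, 4, sizeof(double), compareDoubles);
    assert(v[0] == -1e300 && v[1] == 0.25 && v[2] == 0.5 && v[3] == 1e300);
}
```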

[+] jayvanguard|11 years ago|reply

In 1995 C++ wasn't a good choice. In 2000 it was a poor decision in most cases. In 2005 it was a bad decision in almost every case. In 2010 it was completely indefensible. It is almost 2015, why are we even talking about it? C++ was a mistake. A bad detour on the highway of computing.

[+] pmelendez|11 years ago|reply

>"In 1995 C++ it wasn't a good choice. In 2000 it was a poor decision in most cases. In 2005 it was a bad decision in almost every case. "

Was it? Most console games were written in C++ in that period. A very popular desktop office suite is built on C++. The most popular design and photo editing tool for Windows is written in C++.

So are you suggesting that all those guys who picked C++ for those popular pieces of software were some sort of lucky morons?

That is a strong and risky assertion, don't you think?

[+] sbmassey|11 years ago|reply

Because the alternatives to C++ are even worse.

There are still a lot of problem areas where GC is unacceptable, or you need precise control over memory, or access to things like CUDA, along with the higher-order programming constructs you get in C++.

Maybe Rust will take over some day ...

[+] btmorex|11 years ago|reply

You're wrong, but I think you just don't have the right kind of experience to understand why.

[+] adamnemecek|11 years ago|reply

So what's an alternative?

[+] Symmetry|11 years ago|reply

C++ is currently really the only choice if you're making an application that is both large and performance critical. If you're building a AAA video game or a web browser or a robot's navigation system then it probably is the best choice even if it's a painful one.

[+] geofft|11 years ago|reply

C++11 is a very different language, in much the same way Java 8 isn't Java 1, or Visual Basic .NET isn't BASICA.

(It's entirely possible that all six of these languages are terrible, but that would be six separate claims.)

[+] GFK_of_xmaspast|11 years ago|reply

Because C++11.

110 comments