
C++ containers that save memory and time

231 points | dsr12 | 13 years ago | google-opensource.blogspot.com

111 comments

[+] chrisaycock|13 years ago|reply
Facebook released their C++ containers last year as well, which they named Folly. An interesting feature they added is that their vector class is "aware" of the memory allocator. I.e., if jemalloc is available, then the vector class will attempt to resize all dynamic memory structures to fit within cache hierarchy sizes.

http://news.ycombinator.com/item?id=4059356

[+] arianvanp|13 years ago|reply
Reading C++ code always gives me a weird feeling something is inherently wrong with the language. Header files containing nearly all the implementation because that's the only place where you can define templates is so odd... (https://code.google.com/p/cpp-btree/source/browse/btree.h)
[+] viraptor|13 years ago|reply
Comments like this one don't help either:

    // Inside a btree method, if we just call swap(), it will choose the
    // btree::swap method, which we don't want. And we can't say ::swap
    // because then MSVC won't pickup any std::swap() implementations. We
    // can't just use std::swap() directly because then we don't get the
    // specialization for types outside the std namespace. So the solution
    // is to have a special swap helper function whose name doesn't
    // collide with other swap functions defined by the btree classes.
[+] shawn-butler|13 years ago|reply
I would suggest, if you have this feeling, that you may not have a good grasp of template metaprogramming. This link might help: http://www.parashift.com/c++-faq-lite/templates-defn-vs-decl...

Also, the export keyword for templates, while valid according to the C++98 standard, was limited by compiler compliance (essentially only Comeau/EDG implemented it). Kind of a chicken-and-egg problem, not a language problem.

Finally, you might investigate the extern template declaration in C++11: http://en.wikipedia.org/wiki/C++11#Extern_template

I have always wondered whether a grand rethinking of compilers/linkers wasn't in order given the radically different face of computing and computational resources available now as opposed to 30+ years ago.

[+] blablabla123|13 years ago|reply
Thanks to Comeau C++'s standard compliance, you can put the template into the .cc files. ;)
[+] cbsmith|13 years ago|reply
Yes, what a flawed language, having the declaration and the definition all in one place together. What other language does this, aside from Ruby, Python, Perl, JavaScript, Java, C#, Smalltalk, Visual Basic, PHP, Haskell, Lua...
[+] halayli|13 years ago|reply
Welcome to templates and inline functions.
[+] leetrout|13 years ago|reply
And this is why you need to know all the algo and data structure "basics" when interviewing at the big G. Because people there are working on these sorts of optimizations in their spare time.
[+] Smerity|13 years ago|reply
Agreed. When I was working on Google App Engine, the Quota team sat next to me. The Quota team runs the service that just about every other service at Google calls to ensure a user isn't abusing "The Google". That means they need insanely small response times to ensure that all the services run on time.

They hit an issue no-one really thinks about: resizing hash tables. It happens transparently under the surface, but for them, the few milliseconds it took to resize a hash table would have had enormously detrimental effects on just about every other Google service. The solutions they were talking about were really quite amazing. That's the insanity they face.

So, if anyone hasn't seen Google Sparsehash[1], check it out: it's another near drop-in improvement to the STL. Two bits per entry of overhead and insanely fast. I can't remember if the Quota team used it specifically, but I wouldn't be surprised if it's one of the results of their work.

[1]: http://code.google.com/p/sparsehash/

[+] eridius|13 years ago|reply
What does knowing this stuff at interview-time have to do with being able to implement it later? I certainly don't remember how b-trees work off the top of my head, but what I do know is how to find out the answer given a computer and an internet connection.
[+] adamnemecek|13 years ago|reply
Or maybe it's self-selection bias.
[+] shin_lao|13 years ago|reply
I would recommend to have a look at sorted arrays first. There is an implementation available in Boost:

http://www.boost.org/doc/libs/1_52_0/doc/html/container/non_...

[+] cbsmith|13 years ago|reply
Sorted arrays are a common choice that I've seen people go to in lieu of std::set, but a proper B-tree implementation kicks butt compared to a sorted array for performance when mutating data, and actually an unsorted array tends to kick butt for small array sizes (<100 elements).
[+] damian2000|13 years ago|reply
Anyone know why the STL doesn't use B-trees? Are there use cases where red-black trees are better?
[+] gsg|13 years ago|reply
The STL guarantees both iterator and address stability for the elements of a set, map, or the unordered variants. Those guarantees can't be provided by many more compact representations.

Note that this rules out more than just B-trees: open addressed hash tables and judy array like structures have the same issue.

[+] andreasvc|13 years ago|reply
The article says that B-trees are advantageous when the cost of cache misses outweighs the cost of comparisons, e.g., with integer keys. By comparison, a red-black tree performs fewer comparisons, and a hash table is faster on average but is unordered (there may be other differences too).
[+] guilloche|13 years ago|reply
Without considering cache misses, a red-black tree is much faster than a B-tree. So this B-tree implementation must be tailored to the cache-line size, and it may perform badly on some CPUs.

The performance gain of the B-tree container also depends largely on the size of the values. So choose this one carefully.

[+] SenorWilson|13 years ago|reply
I don't really think it's possible to say one is better than the other without a specific application in mind. They probably just chose whichever one they wanted at the time.
[+] olivier1664|13 years ago|reply
I'm confused. From my understanding, B-trees are good when you load data from a hard drive: they minimize the number of times you load a range of data from the disk. Since this C++ B-tree lives in RAM, where data loading should matter much less, how does it post such good benchmarks? What is the secret ingredient?
[+] alexgartrell|13 years ago|reply
If I had to guess, I'd say better cache locality. It's tremendously more expensive to go to main memory for something than to go to the cache.