top | item 39126691

(no title)

jemfinch | 2 years ago

This really doesn't seem to be comparing to comparable data structures. For int map specializations like this, the optimized alternatives are things like Judy (which is looking quite aged these days) or roaring bitmaps, not to mention that any C++ developer using "ordinary" maps will be using absl's SwissTable (flat_hash_map) or folly's F14 (F14FastMap) or perhaps absl::btree_map if order is important. Comparisons to std::map and std::unordered_map are simply too naive to make the case for this data structure.

discuss

order

ww520|2 years ago

Oh come on. Don't be too harsh. This is an ordered map. Most of the mentioned ones are unordered maps. They might be fast but they are unordered. The only comparables are absl::btree_map and Judy Array. Without benchmarking, my gut feeling is this will beat absl::btree_map. Trie usually beats BTree.

jemfinch|2 years ago

I've written a lot of high performance/scale C++ code using a lot of data structures over the years, and ordered iteration has been very rarely needed; unordered data structures still rule the day in performance the vast majority of the time, and their lower constant factors very frequently outperform more specialized data structures. They're absolutely worth benchmarking against if the goal is actual uptake in the actual world.

In my experience, practically every single time I've used absl::btree_map, I've ended up reverting for performance reasons to either a flat hash map or, in some relatively rare cases, a sorted vector map (despite its O(n) insert/erase) because the constant factors are _so doggone low_. The experience remains: btree_map (or SkipLists, or whatever) has (in my experience) essentially never, in over a half million lines of C++, actually remained in the code.

Also, I presume (based on the implementation details, not based on actual use) that roaring bitmaps have some reasonable iteration API that would make them relevant even to the ordered comparison.

The important thing here is that if anyone wants to contend that someone should use their new data structure because it's better or faster or more optimal for some particular use case, it's important for them to demonstrate that they've thoroughly investigated the data structure space around their proposal. Comparison against std::map and std::unordered_map simply doesn't demonstrate the kind of comprehensive knowledge that I would expect from someone who claims that I should use their optimized integer map in my own code.