top | item 7423072

(no title)

lzhou | 12 years ago

So the old - allocate a massive array, then use 2n+1, 2n+2 for the left/right children wouldn't have flown with you huh?

discuss

greiskul|12 years ago

This is very common for binary heaps, but for binary trees that can be unbalanced, this won't fly. The space usage is exponential if you insert sequential items into the tree (Since you allocate the space for the pointers to every potential node in every level of the tree).

bratfarrar|12 years ago

People will occasionally try that, but my point to them is that they should be writing a general purpose BST, which should also allow degenerate trees. So no, I do try to steer people away from theoretically correct but desperately inefficient approaches.

dalke|12 years ago

To be fair, it's "desperately inefficient" only in the face of unbalanced trees, no? Which mostly only happens for unrealistic scenarios.

As stated, the problem assumes that time lookup isn't important, even though someone who wants a BST almost certainly chose it to get faster than linear lookup along with fast insert/delete operations and ordered searches. (Otherwise, why not use a hash table?) A self-balancing tree gives that additional guarantee, and in that case an array-based data storage won't be "desperately inefficient".

Indeed I suspect an array will be competitive or even better than an object based version, where each object has its own pointer overhead and associated memory fragmentation.

At the very least, the early AVL work in the 1960s used a tape for storage, and I suspect they didn't waste all that much tape space.

So someone's intuition from real-world use of BSTs might end up giving you a false negative.

lzhou|12 years ago

That's when the flustered interviewee can either fight (patch his solution) or flight (start all over using a class with refs). The most obvious patch is to use a large hashmap instead of a massive array. Still inefficient, but no longer that desperate (and actually has some added benefits).