top | item 4311280

Binary search is a pathological case for caches

80 points| kinetik | 13 years ago |pvk.ca | reply

13 comments

order
[+] koverstreet|13 years ago|reply
...This is news? You've got a tight loop with a branch, and depending on which way the branch goes, you're going to touch completely different locations in memory.

The solution (at least for some problems) is just to lay your data out in memory better - optimal is a binary search tree in an array, like the way heaps are implemented.

That's what I did for bcache: http://evilpiepirate.org/git/linux-bcache.git/tree/drivers/m...

[+] dmpk2k|13 years ago|reply
Good lord. You didn't read the article _at all_.

I have some really ugly things to say to a new poster who cannot be bothered to read an article before trying to show off, but I'll refrain. In the future, read first, think, then comment. You'll be doing everyone else a favor.

[+] steerb|13 years ago|reply
Using the heap layout, you can sometimes even eliminate branch misspredictions and subsequent (expensive) pipeline flushes.

The basic idea is to allow your compiler to use predicated instructions. Do do this, you have to transform control dependencies (e.g., `if (a > b)`) to data dependencies on index computations (e.g., `j = 2*j + (a > b)`).

For a complete explanation, see http://algo2.iti.kit.edu/sanders/papers/ssss.ps.gz

[+] keenerd|13 years ago|reply
> Optimal is a binary search tree in an array, like the way heaps are implemented.

Not quite. The heap layout has bad spatial locality. Think about the last row of leaves - it occupies the last 50% of the array. The next-to-last-row (all their parents) is in the middle 25%-50%. All those leaves are far away from their parents, and thus the heap is not optimal.

Better is to cluster parents and children into small groups. With clusters of three, two thirds of the time a child will adjacent to its parent.