boloust | 2 years ago
Because you can only have a single clustered index, you're effectively paying for efficient range queries on a single clustering key by making all other queries slower.
This tradeoff may or may not be worth it depending on your query patterns. In my experience, you can often get away with adding some subset of columns to a non-clustered index to make it covering, and get efficient range queries without making a copy of the entire dataset.
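The "add columns to make it covering" idea can be sketched with SQLite, whose query planner reports when a query is answered entirely from an index. The table and column names here are made up for illustration; the point is that appending `total` to the index lets the range query skip the heap entirely.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, placed_at TEXT, total REAL)")
# Index on the range column, with `total` appended so the query below
# can be answered from the index alone -- no per-row table lookups.
con.execute("CREATE INDEX idx_orders_placed ON orders (placed_at, total)")

plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT placed_at, total FROM orders "
    "WHERE placed_at BETWEEN '2023-01-01' AND '2023-02-01'"
).fetchall()
detail = plan[0][3]
print(detail)  # SQLite reports a scan "USING COVERING INDEX idx_orders_placed"
```

Postgres and SQL Server reach the same effect with `CREATE INDEX ... INCLUDE (...)`, which stores the extra columns only in leaf pages instead of making them part of the key.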
And even with clustered indexes, as soon as you want a range query that isn't supported by your clustered index, you face exactly the same choices, except that you also pay for extra B-tree traversals: a secondary index stores the clustering key, so each match requires a second descent through the clustered index to reach the row.
srcreigh | 2 years ago
Example: 80-byte keys give you a branching factor of roughly 100 (assuming 8 KB pages). With 10M rows and, say, 20 rows packed per page, that's a 4GB table, give or take. That B-tree still has only 3 intermediate layers, with the primary data on a 4th. (Calculation: log(10M/20/0.75)/log(100)+1, using a 0.75 fill factor.) The first two layers take up less than a megabyte of RAM and are therefore easily cached, so you wind up needing only 2 disk reads for the final two layers. Unless Postgres is caching the entire index for point lookups, it should come out about even.
I can't find any resource saying that B-tree height exceeds 5, so I'm thinking it's at worst 2x the (very small) disk read cost vs Postgres.
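The back-of-envelope arithmetic above can be checked directly. Everything here uses the commenter's stated assumptions: 8 KB pages, 80-byte keys (so a fanout near 100), 20 rows per page, and a 0.75 fill factor.

```python
import math

PAGE_BYTES = 8192
FANOUT = PAGE_BYTES // 80        # ~100 separator keys per interior page
ROWS = 10_000_000
ROWS_PER_PAGE = 20
FILL = 0.75                      # pages kept ~75% full

leaf_pages = ROWS / ROWS_PER_PAGE / FILL               # ~667K leaf pages
levels = math.log(leaf_pages) / math.log(FANOUT) + 1   # interior levels + leaf
print(round(levels, 2))          # ~3.9: three interior layers plus the leaves

# The top two layers (the root plus ~FANOUT second-level pages) are tiny,
# so they stay cached and only the bottom two layers need disk reads.
top_two_bytes = (1 + FANOUT) * PAGE_BYTES
print(top_two_bytes)             # well under a megabyte
```

Note the total also checks out: 10M rows / 20 rows per page = 500K data pages, times 8 KB, is about 4 GB.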