davidrowley | 2 years ago

It's a bit complex to explain here, but I describe an idea I've been considering in https://www.postgresql.org/message-id/CAApHDvo2sMPF9m=i+YPPU...

Sesse__|2 years ago

Yes, a lot of this seems interesting to go into. I really hope that at some point someone will find the resources to just try a lot of dumb stuff and see what works in practice. I mean, what we have right now (multiply selectivities together as if they were independent) is also pretty dumb, and there's no good reason why it should be preferred over everything else.
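To make the independence problem concrete, here is a minimal sketch in Python, not planner code. It assumes a hypothetical two-column table where column `b` is fully determined by column `a`, and compares the multiplied-selectivities estimate for `a = 3 AND b = 3` against the fraction of rows that actually match. The function name, the table shape, and the predicate value are all made up for illustration.

```python
import random

def estimate_vs_actual(n_rows=100_000, seed=0):
    """Compare the independence estimate with the true conjunctive selectivity."""
    rng = random.Random(seed)
    # Hypothetical table: b always equals a, so the columns are fully correlated.
    rows = []
    for _ in range(n_rows):
        a = rng.randrange(10)
        rows.append((a, a))

    # Per-column selectivities, as single-column statistics would see them.
    sel_a = sum(1 for a, _ in rows if a == 3) / n_rows
    sel_b = sum(1 for _, b in rows if b == 3) / n_rows

    # Estimate under the independence assumption: multiply the selectivities.
    independent = sel_a * sel_b

    # True selectivity of the conjunction (a = 3 AND b = 3).
    actual = sum(1 for a, b in rows if a == 3 and b == 3) / n_rows
    return sel_a, sel_b, independent, actual

if __name__ == "__main__":
    sel_a, sel_b, independent, actual = estimate_vs_actual()
    print(f"sel(a)={sel_a:.4f} sel(b)={sel_b:.4f}")
    print(f"independence estimate={independent:.4f} actual={actual:.4f}")
```

With each column near 10% selectivity, the multiplied estimate lands near 1% while the true figure stays near 10%, so the row estimate is off by roughly a factor of ten in this contrived case.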

Another avenue is of course trying to avoid the issue to begin with, e.g. through the recent “translation grids” of Müller and Moerkotte for better join selectivities. But I doubt anyone is going to be finding a silver bullet for this anytime soon, so reducing plan risk somehow seems very worthwhile.

davidrowley|2 years ago

> I mean, what we have right now (multiply selectivities together as if they were independent) is also pretty dumb

Yeah, I think it was probably a mistake to always assume there's zero correlation between columns, but what value is better to use as a default? At least extended statistics allows the correlations of multiple columns to be gathered now. That probably means we're less likely to revisit the default assumption of zero correlation when multiplying selectivities.
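One way to frame the "what default is better?" question is to look at the range any default has to fall in. For two predicates with selectivities `s1` and `s2`, the selectivity of their conjunction is bounded below by `max(0, s1 + s2 - 1)` and above by `min(s1, s2)`, and the independence product always sits between the two. The sketch below just computes those three numbers; the function name is hypothetical and this is not how any particular planner does it.

```python
def conjunction_bounds(s1, s2):
    """Return (lower, independent, upper) for the selectivity of `p1 AND p2`.

    s1 and s2 are the individual selectivities, each in [0, 1].
    """
    if not (0.0 <= s1 <= 1.0 and 0.0 <= s2 <= 1.0):
        raise ValueError("selectivities must be in [0, 1]")
    lower = max(0.0, s1 + s2 - 1.0)  # predicates overlap as little as possible
    independent = s1 * s2            # zero-correlation assumption
    upper = min(s1, s2)              # one predicate fully implies the other
    return lower, independent, upper

if __name__ == "__main__":
    for s1, s2 in [(0.1, 0.1), (0.5, 0.9), (0.01, 0.5)]:
        lo, ind, hi = conjunction_bounds(s1, s2)
        print(f"s1={s1} s2={s2}: lower={lo:.4f} independent={ind:.4f} upper={hi:.4f}")
```

For selective predicates the gap between the product and the upper bound is wide, which is where picking a single default gets uncomfortable.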