(no title)
oldandtired | 1 year ago
For many queries, the order in which you specify the joins doesn't really matter. But there are a number of classes where the join order dramatically affects how fast the query can actually run and nothing the query planner does will change this.
I came across this problem around 30 years ago. By accident, I discovered what the problem cause was - the order of the joins. The original query was built and took 30 - 40 minutes to run. I deleted particular joins to see what intermediate results were. In reestablishing the joins, the query time went to down to a couple of seconds.
I was able to establish that the order of joins in this particular case was generating a Cartesian product of the original base records. By judicious reordering of the joins, this Cartesian product was avoided.
If you are aware of this kind of problem, you can solve it faster than any query planner ever could.
sgarland|1 year ago
You can raise the limit at the risk of causing query planning times going up exponentially, or refactor your schema, or, as you did, rewrite it to be more restrictive out of the gate. That way, those join paths will be found first and so will be the best found when the planner gives up.
tacone|1 year ago
Another thing to consider is table fragmentation. Fragmentation > bad row count estimation > bad query plan.
afandian|1 year ago
Combination of two joins, filtering on all tables, and sorting by the LEFT one. Performance was fine, until we hit the scale when it suddenly became unpredictable.
In hindsight, given the variability of the queries and table structure, I don't think any query planner could have done a good job. The natural answer was to denormalize. But the journey to get there was a little unpredictable.