top | item 39314060


davidrowley | 2 years ago

I've considered things like this before but not had time to take it much beyond that. The idea was that the planner could run with all expensive optimisations disabled on the first pass, then re-run with the more expensive optimisations enabled if the estimated total cost of the plan was above some threshold. It does seem pretty silly to worry about producing a plan in a millisecond for, say, an OLAP query that's going to take 6 hours to complete. On the other hand, we don't want to slow down the planner too much for a query that executes in 0.1 milliseconds.

There'd be a few hurdles to get over before we could get such a feature. The planner currently has a habit of modifying the parsed query in place, so we'd either need to stop doing that or make a copy of the query before modifying it. The former would be best.
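The threshold-triggered re-planning idea could be sketched roughly as below. This is a toy model, not PostgreSQL code: `plan_query`, `REPLAN_COST_THRESHOLD`, and the cost numbers are all made-up stand-ins for illustration.

```python
# Hypothetical sketch of two-pass planning: plan cheaply first,
# then re-plan with expensive optimisations only if the estimated
# cost suggests the extra planning effort could pay off.

REPLAN_COST_THRESHOLD = 10_000.0  # made-up threshold, not a real GUC


def plan_query(query: str, expensive_opts: bool) -> dict:
    """Toy stand-in for the planner: with expensive optimisations
    enabled, pretend we discover a cheaper plan at the price of
    more planning effort."""
    return {
        "query": query,
        "total_cost": 50_000.0 if expensive_opts else 80_000.0,
        "expensive_opts": expensive_opts,
    }


def plan_with_retry(query: str) -> dict:
    # First pass: all expensive optimisations disabled.
    plan = plan_query(query, expensive_opts=False)
    # Second pass only when the plan looks expensive enough that
    # spending more time planning is likely worthwhile.
    if plan["total_cost"] > REPLAN_COST_THRESHOLD:
        plan = plan_query(query, expensive_opts=True)
    return plan
```

A sub-millisecond query would stay under the threshold and never pay for the second pass, while a long-running OLAP query would trigger it.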



TylerE | 2 years ago

I don't think it's so much a matter of "expensive" as "unknowable".

For instance, if you have a left join (and let's say it can't use an index for whatever reason), the optimal plan will probably be different if almost every row has a matching row in the join table than if only a few do.

davidrowley | 2 years ago

I think the join search would remain at the same level of exhaustiveness for all levels of optimisation. I imagined we'd maybe want to disable optimisations that apply more rarely or are most expensive to discover when the planner "optimisation level" was set to lower settings. I suppose that would be things like LEFT JOIN removals and self-join removals. However, PostgreSQL does not have very many expensive optimisations that rarely help, so having an optimisation level might be more a way of introducing new optimisations that help fewer queries. Because we don't have a plan cache, there's been a focus on keeping the planner lean and, for example, having it not go to too much effort to optimise queries that are poorly written.
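Gating individual optimisations by a planner "optimisation level" could look something like the sketch below. The level numbers and optimisation names are illustrative assumptions, not real PostgreSQL settings (LEFT JOIN removal and self-join removal are mentioned above; the thresholds assigned to them here are invented).

```python
# Hypothetical mapping from optimisation name to the minimum
# "optimisation level" at which the planner would attempt it.
OPTIMISATIONS = {
    "basic_pushdown": 1,      # cheap and broadly useful
    "left_join_removal": 2,   # applies more rarely, costlier to discover
    "self_join_removal": 3,   # rarer still
}


def enabled_optimisations(level: int) -> set[str]:
    """Return the optimisations the planner would attempt at the
    given optimisation level."""
    return {name for name, min_level in OPTIMISATIONS.items()
            if level >= min_level}
```

At level 1 only the cheap, broadly useful optimisation runs; raising the level progressively enables the rarer, more expensive ones, which is where newer niche optimisations could be slotted in without slowing down planning for everyone.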