top | item 21464272

Postgres Health Check and SQL Performance Analysis

169 points| torvald | 6 years ago |gitlab.com | reply

18 comments

order
[+] craigg|6 years ago|reply
As an Engineering Manager for the recently formed Database team here at GitLab, this tool has proven to be incredibly valuable. We floated the idea of running this on a regular basis back in April https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues... and now we have daily reports that we review https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues...

samokhvalov helped us to get this set up

[+] sciurus|6 years ago|reply
That's a huge wall of text to digest every time the report is run. It would be nice if there was a diff mode that could summarize what changed between two runs.
[+] matthewaveryusa|6 years ago|reply
> tested on real-world databases containing 500,000+ tables and 1,000,000+ indexes.

Would love to hear about this use case out of curiosity

[+] tomnipotent|6 years ago|reply
I'm guessing those numbers are cumulative across all the databases scanned, not within one.
[+] KoenDG|6 years ago|reply
half a million tables...

In how many databases? Spread across how many machines? Using hardware that is how old?

It may well be a good project, but statements like that aren't inspiring of confidence.

[+] ahachete|6 years ago|reply
We have been using postgres-checkup for quite a while. Indeed, we have standardized on it as the basis on which we elaborate the health check and performance report analysis that we provide to our support customers.

It is a great project, constantly improving. Keep up with the work!

[+] nbrempel|6 years ago|reply
This seems like a great idea. I used to have a collection of magic Postgres queries that would give me metrics for optimizing data models and indices. I’m surprised there isn’t a visual tool you can bolt into your Postgres install and get valuable metrics out.
[+] 1996|6 years ago|reply
Interesting, but how heavy is it on the database server?
[+] samokhvalov|6 years ago|reply
Hi, the founder of Postgres.ai here.

The footprint is very minimalistic. I have 15+ years of Postgres DBA experience, and what this tool does basically is what I usually do myself with performing Postgres health checks under heavy load. But in automated fashion :)

We very carefully choose approaches and queries that we run on production servers. And it is used under heavy loads (dozens of thousands of TPS) daily.

There are certain places that can be heavy. For example, if you have 1 million indexes (yes, it happens, sometimes), the SELECT query for bloat analysis will be slow. Actually, with default settings, the tool limits itself setting `statement_timeout = '30s'` (can be adjusted using CLI option `--statement-timeout`). So, in databases with a huge number of objects, you should expect that F004 and F005 reports will be missing.

[+] zlepper|6 years ago|reply
This looks really really nice. Does it also work with something like Citus?