Introduction to PostgreSQL Indexes

[+] cdiamand|2 months ago|reply

Linking to the postgresql docs since they are very well written and surprisingly enjoyable to read.

https://www.postgresql.org/docs/current/indexes-intro.html

[+] brudgers|2 months ago|reply

Related, Use the Index Luke

[+] jihadjihad|2 months ago|reply

The section on multi-column indexes mirrors how I was taught and how I’ve generally handled such indexes in the past. But is it still true for more recent PG versions? I had an index and query similar to the third example, and IIRC PG was able to use an index, though I believe it was a bitmap index scan.

I am also unsure of the specific perf tradeoffs between index scan types in that case, but when I saw that happen in the EXPLAIN plan it was enough for me to call into question what had been hardcoded wisdom in my mind for quite some time.

Further essential reading is the classic Use The Index, Luke [0] site, and the book is a great buy for the whole team.

0: https://use-the-index-luke.com/

[+] petergeoghegan|2 months ago|reply

> The section on multi-column indexes mirrors how I was taught and how I’ve generally handled such indexes in the past. But is it still true for more recent PG versions?

No, it isn't. PostgreSQL 18 added support for index skip scan:

https://youtu.be/RTXeA5svapg?si=_6q3mj1sJL8oLEWC&t=1366

It's actually possible to use a multicolumn index with a query that only has operators on its lower-order columns in earlier versions. But that requires a full index scan, which is usually very inefficient.

[+] glenjamin|2 months ago|reply

A bitmap index scan allows the database to narrow down which pages could include the data, but then still has to recheck the condition on the contents of those pages - so will still not be as performant as an proper index scan

[+] zozbot234|2 months ago|reply

It would be nice to see out-of-the-box support in PostgreSQL for what's known as incremental view maintenance. It's very much an index in that it gets updated automatically when the underlying data changes, but it supports that for arbitrary views - not just special-cased like ordinary database indexes.

[+] BenoitP|2 months ago|reply

A hard problem, especially wrt to transactions on a moving target.

From memory, handful of projects just dedicated to this dimension of databases: Noria, Materialize, Apache Flink, GCP's Continuous Queries, Apache Spark Streaming Tables, Delta Tables, ClickHouse streaming tables, TimescaleDB, ksqlDB, StreamSQL; and dozens more probably. IIRC, since this is about postgres, there is recently created extension trying to deal with this: pg_ivm

[+] lispisok|2 months ago|reply

If you have timeseries data TimescaleDB has this with continuous aggregates

[+] turbocon|2 months ago|reply

This looks really awesome for Postgres

For general B Tree index resources this has been my got to site for years https://use-the-index-luke.com/

[+] augusteo|2 months ago|reply

Good timing for this article. The multi-column index advice was always confusing because the "leading column" rules had real performance implications, but bitmap index scans made it less catastrophic than the textbooks suggested.

Skip scan in PG 18 changes a lot of that conventional wisdom. Worth updating the mental model for anyone who learned indexing on older versions.

[+] morshu9001|2 months ago|reply

The whole btree vs hash discussion is interesting. Many people assume "ID" columns should be hash, but iirc the default btree works best for those. Also treelike structures are fundamentally better for nearly-sequential value insertion.

The blog post that this links to comes to the opposite conclusion though, showing hash winning the benchmarks.

[+] joaomsa|2 months ago|reply

Essential reading. More in-depth than an introduction, but without being overly impenetrable except to those dealing with the internals.

16 comments