top | item 25302009

(no title)

larzang | 5 years ago

Postgres stores records by primary id sequence. If you have completely random IDs, records inserted at the same time are fragmented across the entire set, which makes accessing them slower if they're meant to be sequential.

Whether this matters entirely depends on the type of data you're storing. If you're storing unrelated data which is always random access anyway, non-sequential ids don't really matter. If you're storing related data which is mostly accessed sequentially, e.g. Impressions filtered by a creation date range, it's definitely better to have sequential ids for non-fragmented storage. UUIDv6 isn't as commonplace in libraries as v1 or v4 but offers a good compromise for this.

discuss

wiredfool|5 years ago

No, Mysql does. Postgres doesn't.

(edit, at least, some common mysql table types do)

(further edit -- You don't actually need a primary key in postgres. You might want one, and depending on how you do it, it's generally good design, but there's no actual requirement for them.)

jeltz|5 years ago

No, I do not think this is a good explanation. What you describe is how InnoDB works.

PostgreSQL stores the actual records separately from the primary key index so the actual records won't fragment but the primary key index will. The records themselves will roughly be stored in insertion order. A random primary key in PostgreSQL mainly means that inserts into the primary key index will be more expensive and that it will bloat the primary key index to some extent.

kevincox|5 years ago

Note that this can actually be beneficial if you are sharding the table as now new inserts will be put on different shards. (Of course ideally you would be inserting sequentially but on different shards)