Implement table partitioning

190 points | rachbelaid | 9 years ago | git.postgresql.org

56 comments

samcheng|9 years ago

Any support for "rolling" partitions? e.g. A partition for data updated less than a day ago, another for data from 2-7 days ago, etc.

I miss this from Oracle; it allows nice index optimizations as the query patterns are different for recent data vs. historical data.

I think it could be set up with a mess of triggers and a cron job... but it would be nice to have a canonical way to do this.
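With the declarative syntax in this commit, a rolling window looks closer to canonical: a scheduled job pre-creates the next day's partition and drops the oldest one. A sketch (table and column names invented), assuming daily range partitions:

  -- parent, partitioned by day
  CREATE TABLE events (created_at timestamptz NOT NULL, payload text)
      PARTITION BY RANGE (created_at);

  -- nightly job creates tomorrow's partition ahead of time...
  CREATE TABLE events_2016_12_09 PARTITION OF events
      FOR VALUES FROM ('2016-12-09') TO ('2016-12-10');

  -- ...and drops the oldest one, which is far cheaper than DELETE
  DROP TABLE events_2016_12_01;

Rows still don't move between partitions automatically, though, so this covers append-mostly data rather than rows whose timestamps get updated.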

jtc331|9 years ago

The fundamental issue here is that you'd actually have to move the rows between relations given that Postgres maintains separate storage etc. for each. There's no good way to do that.

willvarfar|9 years ago

Living with the cron jobs for a big mysql db, and wishing the DB understood this seemingly common use-case :(

lobster_johnson|9 years ago

How does this work in Oracle? Seeing as the partitioning constraint would be time-dependent, wouldn't it need to re-evaluate it at regular intervals in order to shuffle data around? Is the feature explicitly time-oriented?

rachbelaid|9 years ago

egeozcan|9 years ago

If you also didn't know what exactly partitioned tables are, here's a nice introduction from Microsoft:

https://technet.microsoft.com/en-us/library/ms190787(v=sql.1...

It's for SQL Server, but I assume it's mostly relevant. Please correct me if I'm wrong.

SideburnsOfDoom|9 years ago

So this is all about partitioning data into different storage files on the same server? What is the main benefit of that?

ktopaz|9 years ago

I don't get it? Table partitioning is already supported in PostgreSQL and has been for a long time (at least since 8.1). Where I work, we use table partitioning with PostgreSQL 9.4 on the product we're developing.

https://www.postgresql.org/docs/current/static/ddl-partition...

fabian2k|9 years ago

As far as I understand, this is about declarative partitioning. So you don't have to implement all the details yourself anymore; you just declare how a table should be partitioned instead of defining tables, triggers, ...
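Roughly like the sketch below (names invented; syntax per the commit's added docs) — the declaration replaces the hand-written CHECK constraints and INSERT triggers:

  -- declare how the table is partitioned...
  CREATE TABLE measurement (logdate date NOT NULL, reading int)
      PARTITION BY RANGE (logdate);

  -- ...then declare each partition; no trigger machinery to maintain
  CREATE TABLE measurement_2016 PARTITION OF measurement
      FOR VALUES FROM ('2016-01-01') TO ('2017-01-01');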

roller|9 years ago

The linked patch notes specifically mention the difference between this and table inheritance based partitioning.

  Because table partitioning is less general than table inheritance, it
  is hoped that it will be easier to reason about properties of
  partitions, and therefore that this will serve as a better foundation
  for a variety of possible optimizations, including query planner
  optimizations.

rtkwe|9 years ago

"Supported" in so far as you could basically roll your own implementation, having it managed by the engine is massively more useful and easier to support and setup. A lot of things are supported if you're willing to bodge it together like that.

sapling|9 years ago

It sounds like this is column-level partitioning. Each column or set of columns (based on a partitioning expression) is stored as a different subtable (or something similar) on disk. If only a few columns are frequently accessed, they can be put in cache / on a faster disk, or used for other neat optimizations in join processing.

tajen|9 years ago

About donations: I believe PostgreSQL now deserves more advertising and marketing to develop its adoption in major companies and, hence, get more funding. If I donate on the website, it says it will help conferences. Where should I donate?

bigato|9 years ago

Supposing all partitions are on the same disk, and that you've indexed your data well enough for your usage that Postgres doesn't need to do full table scans: are there any additional performance benefits to partitioning?

lobster_johnson|9 years ago

Well, anything that reduces the size of a search space helps performance.

Partitioning can drastically improve query times because the planner can use statistics only from a single partition (assuming the query works on a single partition). Postgres uses (among other things) a range histogram of cardinalities to determine the "selectivity" — how many rows a query is likely going to match. If you have a 1B-row table and you're looking for a value that only occurs once (low cardinality), the statistics won't help all that much. But if you partition it so that you're only looking at 1M rows instead of 1B, the planner can be a lot more precise.

Another point is cache efficiency. You want the cache to contain only "hot" data that's actively being queried for. If most of your queries are against new data, then without partitioning, any single page would likely contain tuples from all time periods, randomly intermixed, and so a cached page will contain a lot of data that's not used by any query. (Note that if you use table clustering, which requires regularly running the "CLUSTER" command, then you can achieve the same effect at the expense of having to rewrite the entire table.) If you partition by time, you ensure that the cache is used more optimally.

Write access is also helped by partitioning by cold/hot data: B-tree management is cheaper and more cache-efficient if it doesn't need to reorganize cold data along with the hot. And smaller, frequently changed partitions can be vacuumed/analyzed more frequently, while unchanging partitions can be left alone.
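The planner side of this is visible with EXPLAIN: given partitions whose CHECK constraints (or declared bounds) cover disjoint date ranges, a query filtering on the partition key should only scan the matching children, e.g. (names invented):

  SET constraint_exclusion = partition;  -- 'partition' is the default

  EXPLAIN SELECT count(*)
    FROM measurement
   WHERE logdate >= '2016-12-01';
  -- the plan should list only the December child, not the cold partitions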

gdulli|9 years ago

1. Flexibility/freedom to distribute partitions in the future if needed.

2. Indexing doesn't work well in all cases. You can be better off scanning entire small partition tables that lack an index on a given column than querying a single very large table, whether or not that column has an index. (Indexes take up space and need to be read from disk if they don't fit in a memory cache, indexes don't work well for low-cardinality columns, etc.)

3. There are operations you can parallelize on a large number of small/medium tables and perform faster or more conveniently than a single very large table. One of my favorite techniques:

  # usage: seq 0 255 | parallel -j16 ./alter-tablespace.sh {}
  hexn=`printf "%02x" $1`
  psql -Atc "ALTER TABLE tablename_${hexn} SET TABLESPACE new_tblspace" -hxxx -Uxxx xxx

4. A nice side effect of properly/evenly partitioned data you get for free is that you can do certain types of analysis on a single partition (or a few) very quickly and have it represent a sampled version of the data set. You can think of it as another index you get for free.
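To illustrate point 4 with the same 256-way hex layout as above: assuming the partitioning hash is uncorrelated with what you're measuring, any one partition is a roughly 1/256 random sample, so something like

  -- approximate answer from ~0.4% of the data, near-instant
  SELECT avg(amount) FROM tablename_a3;

(where "amount" stands in for whatever column you're aggregating) gives a quick sampled estimate for free.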

takeda|9 years ago

To add to responses that you already got there's also a nice use case that partitioning helps with.

When you have a table that you're constantly inserting large amounts of data into, and you're similarly removing old data at the same rate (i.e. you only care about a month of data).

If you set up partitions, for example one per day, it's way faster to drop old tables than to perform a delete.
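Concretely (names invented; DETACH per the new syntax in this commit):

  -- near-instant regardless of row count: just unlink the partition
  DROP TABLE events_2016_11_08;

  -- or keep the data around but remove it from the parent
  ALTER TABLE events DETACH PARTITION events_2016_11_08;

  -- versus touching, and later vacuuming, every dead row:
  DELETE FROM events WHERE created_at < '2016-11-09';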

Jweb_Guru|9 years ago

Yes. Less latch contention for nodes of a single btree index, for instance.

gdulli|9 years ago

This message was confusing to me because I've been using/abusing Postgres inheritance for partitioning for so long that I forgot Postgres didn't technically have a feature called "partitioning".

What I'm looking forward to finding out is if I can take an arbitrary expression on a column and have it derive all the same benefits of range partitioning like constraint exclusion.

vincentdm|9 years ago

I really like this addition. We store a lot of data for different customers, and most of our queries are only about data from a single customer. If I understand it correctly, if we would partition by customer_id, once the query planner is able to take advantage of this new feature, it will be much faster to do such queries as it won't have to wade through rows of data from other customers.
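If I read the commit right, that would be list partitioning, something like (names invented):

  CREATE TABLE orders (customer_id int NOT NULL, total numeric)
      PARTITION BY LIST (customer_id);

  CREATE TABLE orders_cust_42 PARTITION OF orders FOR VALUES IN (42);
  -- several small customers can share one partition
  CREATE TABLE orders_small PARTITION OF orders FOR VALUES IN (7, 9, 13);

Though presumably that only scales to a modest number of partitions, so some grouping of customers would be needed.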

Another common use case is that we want to know an average number for all/some customers. To do this, we run a subquery grouped by customer, and then calculate the average in a surrounding query. I hope that the query planner will eventually become smart enough to use the GROUP BY clause to distribute this subquery to the different partitions.

tda|9 years ago

I just tried to implement table partitioning in PostgreSQL 9.6 this week. With some triggers and check constraints this seems to work quite nicely, but I was a bit disappointed that hash-based partitioning is currently not possible (at least not without extensions).

Will hash based partitioning be included in PostgreSQL 10? The post notes

  A partitioning "column" can be an expression.
so I can assume it will be supported?

jtc331|9 years ago

As long as the expression being hashed doesn't change, then yes, you could make the expression a hashing function call. If the expression being hashed is mutable, there would be issues, since the feature doesn't currently support updates that result in rows moving between partitions.

amitlan|9 years ago

Not natively, as in there is no PARTITION BY HASH (<list-of-columns>). What limitations do you face when trying to roll-your-own hash partitioning using check constraints (in 9.6)?
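(For reference, the check-constraint version would be something like the sketch below, using hashtext() purely as an example hash; the main limitation I'd expect is that constraint exclusion only kicks in when the query repeats the exact same expression:)

  CREATE TABLE t (id text NOT NULL);

  -- one child per hash bucket; abs() because hashtext can return negatives
  CREATE TABLE t_h0 (CHECK (abs(hashtext(id)) % 4 = 0)) INHERITS (t);
  CREATE TABLE t_h1 (CHECK (abs(hashtext(id)) % 4 = 1)) INHERITS (t);
  -- ... t_h2, t_h3 ...

  -- the query has to restate the bucket expression for exclusion to work:
  SELECT * FROM t
   WHERE id = 'abc'
     AND abs(hashtext(id)) % 4 = abs(hashtext('abc')) % 4;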

vemv|9 years ago

While seemingly extensive, I don't quite like the commit message.

It doesn't say what table partitioning is, or what its use cases would be. That's the first thing you should say; otherwise, how am I going to understand / keep interest in the rest of the text?

pilif|9 years ago

The commit is written by Postgres developers for Postgres developers. I would say that 90% of the intended audience of that commit message doesn't need an explanation of what table partitioning does.

For them this would be needless clutter that's not at all relevant to the commit.

Once we're approaching the 10.0 release, human-friendly release notes, additional manual chapters, and sample code will be written for users to understand (in fact, the commit linked by this submission already contains quite a bit of additional documentation to be added to the manual).