exekias | 2 years ago

Hi there, I'm one of the pgroll authors :)

I could be mistaken here, but I believe pg-osc and pgroll take similar approaches to avoiding locks and to backfilling data.

While pg-osc uses a shadow table and switches to it at the end of the process, pgroll creates shadow columns within the existing table and leverages views to expose old and new versions of the schema at the same time. Having both versions available means you can deploy the new version of the client app in parallel to the old one, and perform an instant rollback if needed.
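
To make the shadow-column-plus-views idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 rather than Postgres so that it runs anywhere, and the names (`users`, `new_name`, `users_v1`, `users_v2`) and the `'unknown'` default are hypothetical, not what pgroll actually generates. It only illustrates how two views over one physical table can expose the old and new schema versions side by side.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, NULL);

    -- Hypothetical shadow column holding the new, never-null version of name.
    ALTER TABLE users ADD COLUMN new_name TEXT;
    UPDATE users SET new_name = COALESCE(name, 'unknown');

    -- One view per schema version, both reading the same physical table.
    CREATE VIEW users_v1 AS SELECT id, name FROM users;
    CREATE VIEW users_v2 AS SELECT id, new_name AS name FROM users;
""")

# The old client keeps querying users_v1; the new client queries users_v2.
old_rows = conn.execute("SELECT id, name FROM users_v1 ORDER BY id").fetchall()
new_rows = conn.execute("SELECT id, name FROM users_v2 ORDER BY id").fetchall()
print(old_rows)
print(new_rows)
```

Both clients see a column called `name`, so either app version can run unchanged while the migration is in progress.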

gvkhna | 2 years ago

Thanks for the reply. A write-up on the pros and cons of these approaches would be fantastic. I have no clue which is better, but I believe pg-osc is heavily inspired by github/gh-ost, GitHub's tool for online schema changes for MySQL.

brycethornton | 2 years ago

Does pgroll have any process to address table bloat after the migration? One of the (many) nice things about pg-osc is that it results in a fresh new table without bloat.

surjection | 2 years ago

Another pgroll author here :)

I'm not very familiar with pg-osc, but migrations with pgroll are a two-phase process: an 'in progress' phase, during which both old and new versions of the schema are accessible to client applications, and a 'complete' phase, after which only the latest version of the schema is available.

To support the 'in progress' phase, some migrations (such as adding a constraint) require creating a new column and backfilling data into it. Triggers are also created to keep both old and new columns in sync. So during this phase there is 'bloat' in the table in the sense that this extra column and the triggers are present.
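
As a rough illustration of the backfill-plus-triggers step, here is a minimal sketch. It again uses Python's built-in sqlite3 as a stand-in for Postgres, and the table, column, and trigger names (`users`, `new_name`, `users_sync_insert`, `users_sync_update`) and the `'unknown'` default are all hypothetical. It shows only the old-to-new direction of the sync; a real setup would presumably need the reverse direction as well.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, NULL);

    -- Hypothetical shadow column for the new version of name.
    ALTER TABLE users ADD COLUMN new_name TEXT;

    -- Keep the shadow column in sync with writes made through the old column.
    CREATE TRIGGER users_sync_insert AFTER INSERT ON users
    BEGIN
        UPDATE users SET new_name = COALESCE(NEW.name, 'unknown')
        WHERE id = NEW.id;
    END;

    CREATE TRIGGER users_sync_update AFTER UPDATE OF name ON users
    BEGIN
        UPDATE users SET new_name = COALESCE(NEW.name, 'unknown')
        WHERE id = NEW.id;
    END;

    -- Backfill rows that existed before the triggers were created.
    UPDATE users SET new_name = COALESCE(name, 'unknown')
    WHERE new_name IS NULL;
""")

# Writes from an old client after the migration has started.
conn.execute("INSERT INTO users (id, name) VALUES (3, 'carol')")
conn.execute("UPDATE users SET name = 'alicia' WHERE id = 1")

rows = conn.execute(
    "SELECT id, name, new_name FROM users ORDER BY id"
).fetchall()
print(rows)
```

The triggers are created before the backfill runs, so a row written while the backfill is in flight still ends up with its shadow column populated.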

Once the migration is completed, however, the old version of this column is dropped from the table along with any triggers, so there is no bloat left behind.