So basically what this does is avoid locking the database while migrating? Obviously (and the README mentions this) the two releases have to be backwards- and forwards-compatible, so with this backend you avoid locking out other processes while the migration runs.
Very useful, especially for long migrations, but I would like to see a bit more detail about how this library achieves this, what the caveats are, etc.
EDIT: Never mind, there's a comprehensive "how it works" section farther down, I just needed to scroll far enough. This is very useful.
- it doesn't use transactions, so if a migration fails partway you'll need to fix the state manually (one point for improvement); on the other hand, you can't run `CREATE INDEX CONCURRENTLY` inside a transaction anyway.
- it can use a `CHECK (column IS NOT NULL)` constraint as a safe `NOT NULL` replacement on huge tables, which isn't compatible with standard Django behavior.

All these cases are highlighted in the README.
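For context, this isn't the library's actual code, just a sketch of the general Postgres pattern behind the `CHECK`-based trick (table and constraint names are made up): the change is split into a fast, barely-locking step and a slower validation step.

```python
def safe_not_null_sql(table: str, column: str) -> list[str]:
    """Build the two-step SQL for enforcing NOT NULL on a huge table
    without a long ACCESS EXCLUSIVE lock (illustrative names only)."""
    constraint = f"{table}_{column}_notnull"
    return [
        # Step 1: add the constraint NOT VALID -- existing rows are not
        # scanned, so the heavyweight lock is held only briefly, while
        # new writes are checked immediately.
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"CHECK ({column} IS NOT NULL) NOT VALID;",
        # Step 2: validate existing rows; this scans the table but holds
        # only a SHARE UPDATE EXCLUSIVE lock, so reads and writes continue.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {constraint};",
    ]
```

A plain `SET NOT NULL` would instead hold an ACCESS EXCLUSIVE lock while it scans every row, which is exactly what you want to avoid on a huge table.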
Looks like this handles a bunch of cases around locking you can run into, which seems a noble goal. I like the approach of making a matrix of all the migration operations and figuring out how to work around the locks, kudos.
There are definitely some wonderful ways to mess things up when starting out with postgres migrations. Nothing quite like the surprise you get the first time you rewrite every tuple on a table with 20M rows because you added a column with a default value (no longer an issue with the latest postgres).
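On Postgres versions before 11 you could dodge that full-table rewrite by splitting the change into three steps. A rough sketch (the `id`-based backfill and all names here are assumptions for illustration, not anything from the library):

```python
def add_column_with_default_steps(table, column, coltype, default, batch=10_000):
    """Yield SQL for adding a column with a default without rewriting
    the whole table at once (pre-Postgres-11 pattern; assumes an `id` PK)."""
    # 1. Add the column with no default: metadata-only, no table rewrite.
    yield f"ALTER TABLE {table} ADD COLUMN {column} {coltype};"
    # 2. Set the default: applies only to rows inserted from now on.
    yield f"ALTER TABLE {table} ALTER COLUMN {column} SET DEFAULT {default};"
    # 3. Backfill old rows in small batches to keep row locks short;
    #    re-run this statement until it updates zero rows.
    yield (f"UPDATE {table} SET {column} = {default} "
           f"WHERE id IN (SELECT id FROM {table} "
           f"WHERE {column} IS NULL LIMIT {batch});")
```

Since Postgres 11, `ADD COLUMN ... DEFAULT ...` with a non-volatile default is itself a metadata-only change, which is why this is no longer an issue.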
As pointed out in the docs, your code must be prepared to support the schema both before and after the migration. Not that I use django anyway, but normally I'm more worried about the interlacing of the code and db changes to keep things running smoothly. Migrate a bit, release some code, migrate some more, etc.
Another alternative would be to use postgres savepoints, which are like transactions inside transactions, as a wrapper around each migration. You can do the same thing: set lock_timeout, catch the error when it's exceeded, and try the transaction again.
Provide an option to run the handful of operations that can't be run inside a transaction as an escape hatch, and then you can retain the ability to run migrations inside transactions, which is usually a good thing.
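The lock_timeout-and-retry part of this idea can be sketched independently of any driver. Here `execute` stands in for a DB-API `cursor.execute`, and catching `TimeoutError` is a simplification of checking for the driver's specific lock-timeout error:

```python
import time

def run_ddl_with_retry(execute, ddl, attempts=5, lock_timeout_ms=3000):
    """Run a DDL statement with a short lock_timeout, retrying with
    backoff when the lock can't be acquired in time (sketch only)."""
    for attempt in range(attempts):
        try:
            # A bare integer is interpreted as milliseconds by Postgres.
            execute(f"SET lock_timeout = {lock_timeout_ms};")
            execute(ddl)
            return True
        except TimeoutError:  # stand-in for the driver's lock-timeout error
            time.sleep(0.01 * 2 ** attempt)  # back off before retrying
    return False
```

The payoff is that a busy table never blocks behind your waiting DDL for long: the statement gives up quickly, other traffic proceeds, and the migration tries again.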
Unfortunately savepoints live only inside a transaction, and an ACCESS EXCLUSIVE lock is released only when the whole transaction commits or rolls back, so savepoints seem to offer no benefit over a plain transaction for schema migrations.
This is awesome. It's not a silver bullet and won't solve every case. You'll still need to design degraded modes into your application for when parts of it become unusable (which you should be doing anyway), but this covers a lot of your likely needs.
Is there any equivalent for django+mysql? I historically found more tools for online schema changes in the mysql sphere, such as gh-ost ( https://github.com/github/gh-ost )
gh-ost is hardly useful for anyone outside of github.
It's predicated on the fact that you don't use foreign keys. Now why would someone use MySQL without FKs... is beyond me, but I'm sure they have their reasons.
There have been some messages on the development mailing list about adding support for create index concurrently when using Postgresql, which is a big pain point for larger migrations.
It's a bit tricky, though, as you cannot use it inside a transaction.
Constraint creation is also tricky: inside a transaction it takes an ACCESS EXCLUSIVE lock (i.e. the table is locked for both reads and writes) while time is spent validating the constraint (CHECK, NOT NULL, FOREIGN KEY), so the lock is released only on transaction commit/rollback.
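For `FOREIGN KEY` (and `CHECK`) constraints specifically, Postgres lets you split creation from validation so the ACCESS EXCLUSIVE window stays short. A sketch with made-up names:

```python
def safe_fk_sql(table, column, ref_table, ref_column="id"):
    """Two-step foreign key creation that keeps the ACCESS EXCLUSIVE
    window short (illustrative names only)."""
    name = f"{table}_{column}_fk"
    return [
        # NOT VALID skips the validation scan, so the heavyweight lock
        # covers only the catalog change; new writes are still checked.
        f"ALTER TABLE {table} ADD CONSTRAINT {name} "
        f"FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column}) NOT VALID;",
        # Validation runs separately under a weaker lock (SHARE UPDATE
        # EXCLUSIVE on the altered table), so reads and writes keep flowing.
        f"ALTER TABLE {table} VALIDATE CONSTRAINT {name};",
    ]
```

Run the two statements in separate transactions; if both ran in one transaction, the ACCESS EXCLUSIVE lock from the first would be held through the slow validation anyway, which is exactly the problem described above.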
Same here. Our backend team uses Python, SQLAlchemy and PG. They are looking for a tool to handle migrations and schema versionings. Any recommendations?
MySQL has online DDL for schema migration (https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.ht...) and MariaDB has ALTER ONLINE (https://mariadb.com/kb/en/library/alter-table/). They both work with triggers + ghost tables, so they don't need transactions.