ndr|1 month ago
It requires a particular dance if you ever want to add/delete a field and make sure both new-code and old-code work with both new-schema and old-schema.
The workaround I found was to run tests with new-schema+old-code in CI when I have schema changes, and then `makemigrations` before deploying new-code.
Are there better patterns beyond "oh you can just be careful"?
tmarice|1 month ago
* https://github.com/tbicr/django-pg-zero-downtime-migrations
* https://docs.gitlab.com/development/migration_style_guide/
* https://pankrat.github.io/2015/django-migrations-without-dow...
* https://www.caktusgroup.com/blog/2021/05/25/django-migration...
* https://openedx.atlassian.net/wiki/spaces/AC/pages/23003228/...
Generally it's also advisable to set a statement timeout for migrations otherwise you can end up with unintended downtime -- ALTER TABLE operations very often require ACCESS EXCLUSIVE lock, and if you're migrating a table that already has an e.g. very long SELECT operation from a background task on it, all other SELECTs will queue up behind the migration and cause request timeouts.
There are some cases where you can work around this by manually composing operations that require less strict locks, but in our case it was much simpler to just make sure all Celery workers were stopped during migrations.
rorylaitila|1 month ago
senko|1 month ago
1. Make a schema migration that will work both with old and new code
2. Make a code change
3. Clean up schema migration
Example: deleting a field:
1. Schema migration to make the column optional
2. Remove the field in the code
3. Schema migration to remove the column
Yes, it's more complex than creating one schema migration, but that's the price you pay for zero downtime. If you can relax that to "1s of downtime at midnight on Sunday", you can keep things simpler. And if you do so many schema migrations that you need such things often ... I would submit you're holding it wrong :)
ndr|1 month ago
Adding a field needs a database-level default (`db_default`), otherwise old-code fails to `INSERT`. You need to audit all the `create`-like calls otherwise.
Deleting a field similarly makes old-code fail all `SELECT`s.
For deletion I need a special 3-step dance with managed=False for one deploy. And for all of these I need to run old-tests on new-schema to catch any usage a member of our team missed.
jgavris|1 month ago
aljarry|1 month ago
1. Create new fields in the DB.
2. Make the code fill in the old fields and the new fields.
3. Make the code read from new fields.
4. Stop the code from filling old fields.
5. Remove the old fields.
Personally, I wouldn't use it until I really need it. But a simpler form is good: do the required (additive) schema changes iteratively, one iteration earlier than the code changes, and do the destructive changes one iteration after your code stops using that part of the schema. Changes like "make a non-nullable field nullable" and "make a nullable field non-nullable" need opposite handling (the ordering relative to the code change flips), but that's part of the price of smooth operations.
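The dual-write phase (step 2) above can be sketched without any ORM; the class and field names are hypothetical:

```python
# Sketch of the dual-write phase: while old and new columns coexist,
# every write goes to both, so either code version reads a consistent
# value. Hypothetical names, shown framework-agnostically.
class User:
    def __init__(self, name=""):
        self.name = name        # old field, still read by old code
        self.full_name = name   # new field, read by new code from step 3

    def rename(self, value):
        # During the transition both fields are written. Once all
        # readers use full_name, drop the old write (step 4), then
        # the old column itself (step 5).
        self.name = value
        self.full_name = value


u = User("Ada")
u.rename("Ada Lovelace")
assert u.name == u.full_name == "Ada Lovelace"
```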
Izkata|1 month ago
m000|1 month ago
When you add new stuff or make benign modifications to the schema (e.g. add an index somewhere), you won't notice a thing.
If the introduced schema changes are not compatible with the old code, you may get a few `ProgrammingError`s raised from the old pods before they are replaced, which is usually acceptable.
There are still some changes that may require planning for downtime, or some other sort of special handling. E.g. upgrading a SmallIntegerField to an IntegerField in a frequently written table with millions of rows.
ndr|1 month ago
rtpg|1 month ago
https://rtpg.co/2021/06/07/changes-checklist.html
I've been meaning to write an interactive version to sort of "prove" that you really can't do much better than this, at least in general cases.
jgavris|1 month ago