(no title)
lbriner | 1 year ago
> How would I monitor the replica? Same. You can also use something like HA proxy calling a postgres CLI command to connect to the instance
> How do I failover? Mostly, you probably want to do this manually because there can be data loss and you want to make sure the risk is worth it. I simply use repmgr for this.
> Do I need 2 replicas? It's usually good to have at least 3 (1 master and 2 slaves) but mostly so that if one fails, you still have 2 remaining i.e. time to get a 3rd back online
> How do I failback? Again, very easy with repmgr, you just tell the primary to be the primary again. The failed over primary gets stopped, the original primary gets fast-forwarded and promoted to primary and everything else gets told to follow.
I do agree that this space for postgres is very fragmented and some tools appear abandoned but its pretty straight-forward with just postgres + barman + repmgr, I have a series of vides on YouTube if you are interested but I am not a Postgres expert so please no hating :-) https://youtu.be/YM41mLZQxzE
cheald|1 year ago
If you're doing manual failover, you don't need an odd number of nodes in the cluster (since you aren't looking for quorum to automatically resolve split-brain like you would be with tools Elasticsearch or redis-sentinel), so for us it's just a question of "how long does it take to get back online if we lose the primary" (answer: as long as it takes to determine that we need to do a switch and invoke repmgr switchover), and "how robust are we against catastrophic failure" (answer: we can recover our DB from a very-close-to-live barman backup from the same DC, or from an offsite DC if the primary DC got hit by an airplane or something).