
samokhvalov | 9 months ago

> we deploy the Postgres instances on Kubernetes via the CloudNativePG operator.

I'm curious whether split-brain cases have already been experienced. At scale, they should be: https://github.com/cloudnative-pg/cloudnative-pg/issues/7407


conradludgate | 9 months ago

Disclaimer: I'm an employee at Neon, another Postgres hosting provider.

My understanding after looking into it is that Xata+SimplyBlock is expected to use the ReadWriteOnce persistent volume access mode. This means the claim can be bound to only one node.

I think this solves the split-brain problem, because any new Postgres read-write pod on a new node will fail to bind the volume claim, but it means no high availability is possible if that node fails. At least, I think that's how Kubernetes handles it: I couldn't find much explaining the failure modes of persistent volumes, but I don't see many other solutions.
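For reference, the access mode is declared on the claim itself. A minimal hypothetical manifest (the names here are made up, not taken from Xata or SimplyBlock):

```yaml
# Hypothetical PVC for a Postgres data directory. ReadWriteOnce means the
# volume can be mounted read-write by a single node at a time, which is
# what prevents a second pod on another node from binding it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```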

At Neon, we solve this issue by having our storage nodes form a consensus protocol with the Postgres node. If a new Postgres node comes online, the two will contend for multi-Paxos leadership. I assume the loser will crash-backoff to reset its in-memory tables, so there's no inconsistency if it tries to reclaim the leadership again and wins. In the normal mode, with no split brain and one leader, multi-Paxos has low overhead for WAL committing.
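The split-brain-prevention part of this can be sketched as term-based fencing: a new compute node claims a strictly higher term from a majority of storage nodes, after which the storage nodes reject WAL appends from the deposed leader. This is a toy illustration under my own assumptions (made-up class names, no networking or durability), not Neon's actual safekeeper protocol:

```python
# Toy sketch of fencing WAL writes with a monotonically increasing term.
# All names are hypothetical; this is not Neon's real implementation.

class StorageNode:
    def __init__(self):
        self.term = 0   # highest term this node has promised
        self.wal = []   # accepted WAL records

    def request_leadership(self, term):
        # Grant leadership only for a strictly higher term (a Paxos-style
        # "promise" not to accept writes from older leaders).
        if term > self.term:
            self.term = term
            return True
        return False

    def append(self, term, record):
        # Reject writes from a deposed leader (lower term): this is what
        # prevents split-brain once a newer compute node has taken over.
        if term < self.term:
            return False
        self.wal.append(record)
        return True


class ComputeNode:
    def __init__(self, storage, term):
        self.storage = storage
        self.term = term

    def start(self):
        # Need a majority of storage nodes to grant the new term.
        grants = sum(s.request_leadership(self.term) for s in self.storage)
        return grants > len(self.storage) // 2

    def write(self, record):
        acks = sum(s.append(self.term, record) for s in self.storage)
        return acks > len(self.storage) // 2


storage = [StorageNode() for _ in range(3)]
old = ComputeNode(storage, term=1)
assert old.start()
assert old.write("insert 1")

new = ComputeNode(storage, term=2)   # failover: a new pod comes online
assert new.start()
assert not old.write("insert 2")     # the deposed leader's writes are fenced
assert new.write("insert 2")
```

The key property is that leadership handoff and write rejection use the same term counter, so there is never a window in which both nodes can commit.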

yencabulator | 9 months ago

How on earth is reporting a broken consistency promise a "discussion"? That is dubious behavior from the project management.

samokhvalov | 9 months ago

I also don't see any good reasons for this.