top | item 22294261

Pgcat – Enhanced PostgreSQL logical replication

145 points| jinhualuo | 6 years ago |github.com

21 comments

order
[+] anarazel|6 years ago|reply
> Instead of worker processes and low-level row ingression, pgcat uses sql template to apply high-level sql commands, so it could make largest compatibility on target table type. It is written in golang and runs in separate process.

Doing "pgcat uses sql template to apply high-level sql commands" unconditionally is somewhat sad. It has WAY higher overhead than the referenced "low-level row ingression". It makes sense to so in some cases (into a view, triggers, apply functions, ...), but always doesn't seem great.

[+] jinhualuo|6 years ago|reply
Yes, the template generated sql would bring some overhead. But it's the cost for flexibility: The target table would not be limited to base table. It could be a view, a foreign table, partitioned table, or citus distributed table. And, it also support table name mapping.
[+] ahachete|6 years ago|reply
I'm not a big fan of conflict resolution, and LWW is a pretty bad choice in most cases. Conflict resolution means data loss (it violates the D of ACID, as an user expects that a transaction that was committed would endure, but it is deleted or overwritten by another one), and most likely C.

The effects of this are hard to predict and can hit pretty hard your business logic. This is not the kind of expectation you get from Postgres. This would fail Jepsen at second 1.

So I'd advise not to use it. The only use case I see for bi-directional replication is where you ensure that all the master nodes have disjoint sets of data, and the logical replication just aggregates them, without a possibility for conflicts.

If you really want multi-master you need a system that works on top of a consensus protocol. And there are definitely tools out there for this. But not LWW.

[+] jinhualuo|6 years ago|reply
Yes, if multi-masters change the same rows of same table, it would cause conflict.

Consensus protocol like raft, has limitation: it requires low delay among the nodes, so normally, it would be deployed in the same data-center. Moreover, if all changes (conflict or not), handled by consensus protocol, the performance would be bad. In fact, data-centers are normally loose coupling (the application normally choose the same data-center as the data source/target). So data-centers just need somehow light-weighted conflict resolution.

pgcat lww, just like Cassandra, provides counter column concept, which is used to retain each change among data-centers, and simple aggregation helper function there. pgcat lww is done by pg trigger, so if you need customization of your business, you could modify the trigger function to fit it.

[+] bonesmoses|6 years ago|reply
LWW / LUW systems are a fine first approximation, but generally do not fulfill the strictest guarantees and desirable outcomes for a database without accounting for a lot of edge cases.

A good conflict management system will have that only as a default. Better ones will also provide mechanisms for overriding LWW in specific scenarios on a per table or even row basis depending on the in/out values from both nodes. The trick is building these conflict management rules so they're independently deterministic, or you'll end up with node divergence.

Normally we just do what you said, and strongly recommend interacting with sticky sessions, geographically disparate segments, etc., to prevent the problem before it happens. It's far easier to avoid a conflict than to correct it after the fact.

While a consensus model "fixes" this, it drastically cuts your throughput to a function of the latency of the most remote node in the quorum, essentially defeating the purpose of having a local Master node. Ideally you would reserve this for specific transactions that are sensitive in that they require absolute consistency regardless of incurred latency (See PACELC). And even this requires a distributed deadlock detector unless more than just the transactions are managed through the consensus layer.

BDR for example provides both of these models for the above reasons. Multi-Master is not an easy problem to solve, and requires a multi-faceted approach to even come close to something usable in a generalized scenario.

[+] _-___________-_|6 years ago|reply
What use cases are there for a replication that doesn't necessarily produce an accurate replica?
[+] akdor1154|6 years ago|reply
Looks really nice! Out of interest, at what stage would the dbas here consider running a newish third party project such as this on a production workload?
[+] edoceo|6 years ago|reply
I use pg logical in production today on a small system (I'm only a couple 100 gig). This tool looks really promising. However, for my core data I look at 6mo of replica realworld load in test before putting anywhere close to live.

Also, for some bits we also run some near shit-hack custom pg extensions in production so...I'm not risk adverse.

[+] CuriouslyC|6 years ago|reply
Personally I would test this in a greenfield project first, unless this solved a serious problem I was facing.