top | item 38687357

PostgreSQL 16 Bi-Directional Logical Replication

127 points| alexzeitler | 2 years ago |highgo.ca

53 comments

order
[+] perlgeek|2 years ago|reply
The description of preconditions seems a bit incomplete.

To me it seems you need to avoid certain constructs, like UNIQUE constraints. Otherwise you might have a local insert plus a replicated one, both the same value in the unique column, and different nodes reject different inserts.

[+] smilliken|2 years ago|reply
Agreed. Also exclusion constraints (a generalization of unique constraints). Also foreign key constraints, because one database might delete a record simultaneously with another database adding a reference to it.
[+] rvdginste|2 years ago|reply
Would I be wrong in thinking that this "bi-directional logical replication" is only usable in a very limited number of use cases? As you mention, unique constraints pose a problem, and a lot of tables would have unique constraints to protect business keys?

The article references another article on this, the refers to the PostgreSQL documentation:

- conflicts [1]

- restrictions [2]

You need to be very aware of the limitations to decide whether this is usable in a specific context. I don't think this is really intended to be used for "bi-directional" logical replication.

[1] https://www.postgresql.org/docs/current/logical-replication-...

[2] https://www.postgresql.org/docs/current/logical-replication-...

[+] dazzla76|2 years ago|reply
Unique constraints can still be used as long as you accommodate the multiple sources. Something as simple of increment by 2 and the servers alternative odd and even as the starting point.

I used to do this with 3 way DB2 LUW replication back when i was a DBA. It all needs a bit of thought but is fairly simple when it comes down to it.

[+] egamirorrim|2 years ago|reply
I don't know that this article tackled the real elephant in the room for bi-di replication. What happens when the same insert happens on both primaries at once, who wins?
[+] singron|2 years ago|reply
They will each apply the insert locally. Then they will attempt to replicate that insert to each other. Each will attempt to apply the replicated conflicting insert, which will cause an error and halt replication for both nodes.

If you update the same data on both nodes, this is a recipe for almost certain disaster. Postgres is not a distributed database and this doesn't make it one.

[+] winrid|2 years ago|reply
Yeah, that's not what this is for. This is more for "I have one DC in US and one in EU, and so on, with completely different customers, but if I need to I can access the data from any region anywhere".
[+] simonw|2 years ago|reply
Right: Is the idea here that you use UUIDs for inserted rows, hence avoiding duplicate inserts?

But what happens if multiple updates or deletes target the same existing row?

[+] grogers|2 years ago|reply
In the MySQL world, bidirectional replication is common, but with at-most-one of the two being writable (the replication user can still write even if the secondary is read-only). Maybe they meant it to be similar and not truly multi-master? It does seem like a glaring omission given they talk about having two primaries.
[+] solidsnack9000|2 years ago|reply
Maybe the way to use this is to have two nodes but only one of which receives traffic at any one time. With an active-active setup like that, the failover process can be swift.

However, this is kind of a tough thing to do right -- how do really know that only one is receiving traffic? -- and I wonder if there are many cases where it is a compelling alternative to an ordinary failover setup.

[+] Horffupolde|2 years ago|reply
You use shard-compatible ids like snowflake.
[+] tristenharr|2 years ago|reply
This is exciting, I’m hopeful that we get distributed SQLite via something like Turso that can hook into this to stream subsets of data into SQLite edge-caches that can be pushed through the edge onto devices via something like embedded replicas. I know experimental work has been done here with pg_turso (I recently wrote my first few lines of zig to add support for some additional data-types for it) and this bodes well with the recent announcement of Postgres coming to fly.io, I dream of a day when I can make Postgres and SQLite play nicely together.
[+] kroolik|2 years ago|reply
I find the explanation of logical and physical replication to be weird. Seems like only logical replication transfers changes.

I have always thought physical replication is replicating using WAL logs, which is a stream of changes. The logical replication would execute the queries on the replicating nodes, leaving WAL management to the replica.

This is also the reason why physical replication requires same pg versions (there may be diffs to the physical format of data in WAL), whilst logical replication uses domain language, which is independent from physical layout.

[+] MarvinYork|2 years ago|reply
Could this handle more than two databases with this technique or would it generate the mentioned loop when a third database sends the data to multiple databases?
[+] ttfkam|2 years ago|reply
Yes, the implementation distinguishes between a replication client and a user/app client specifically to prevent multiple query invocations. Network latency and CAP tradeoffs still apply of course, but you can absolutely scale well beyond two instances with this.
[+] westurner|2 years ago|reply
/? postgres replication https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

"Pgactive: Active-Active Replication Extension for PostgreSQL on Amazon RDS" (2023) https://news.ycombinator.com/item?id=37838223

/? "pgactive" site:github.com https://www.google.com/search?q=%22pgactive%22+site%3Agithub...

cloudnative-pg/cloudnative-pg: https://github.com/cloudnative-pg/cloudnative-pg :

> primary/standby

cloudnative-pg > Logical replication #13: https://github.com/cloudnative-pg/cloudnative-pg/issues/13

/? logical replication:

https://www.google.com/search?q=logical+replication

pgadmin docs > Publication Dialog; logical replication: https://www.pgadmin.org/docs/pgadmin4/development/publicatio...

https://github.com/dalibo/pg_activity#faq ; pip install `pg_activity[psychopg]` :

> FAQ: I can't see my queries only TPS is shown

(How) Do any ~pg_top tools delineate logical replication activity?

pgcenter > PostgreSQL statistics [virtual tables] (and also /proc) https://github.com/lesovsky/pgcenter#postgresql-statistics

...

"Show HN: ElectricSQL, Postgres to SQLite active-active sync for local-first apps" (2023) from the creators of CRDTs https://news.ycombinator.com/item?id=37590257

electric-sql/electric: https://github.com/electric-sql/electric :

> ElectricSQL is a local-first software platform that makes it easy to develop high-quality, modern apps with instant reactivity, realtime multi-user collaboration and conflict-free offline support.

> Local-first is a new development paradigm where your app code talks directly to an embedded local database and data syncs in the background via active-active database replication. Because the app code talks directly to a local database, apps feel instant. Because data syncs in the background via active-active replication it naturally supports multi-user collaboration and conflict-free offline

"SQLedge: Replicate Postgres to SQLite on the Edge" (2023) https://news.ycombinator.com/item?id=37067980 ; gh-ost, dolt, sqldiff

/?hnlog ctrl-f Consistenc, Consensus :

- "A few notes on message passing" https://news.ycombinator.com/item?id=26535969 :

> The C in CAP theorem is for Consistency [3][4]. [...]

> [3] https://en.wikipedia.org/wiki/Consistency_model

> [4] https://en.wikipedia.org/wiki/CAP_theorem