top | item 4700328

(no title)

hgimenez | 13 years ago

A little-known feature of Postgres' streaming replication is Synchronous Replication or 2-safe replication, available from version 9.1 onward. It allows you to specify that a given commit must be durable on a standby server prior to the server returning to the client.

You can turn this on cluster wide, or just for single transactions using session variables. In your example, you can specify that this update must be synchronously replicated, and as such the consequent request will show the user the updated data.

Read more about it here: http://www.postgresql.org/docs/9.1/static/warm-standby.html#...

discuss

order

recuter|13 years ago

Wait, so you want to block the response until a cross database RPC completes? That sounds like it would not be very performant and possibly introduce some other problems. I'm guessing this is best used sparingly with specific transactional queries.

fdr|13 years ago

Also, I think that mode does not yet support waiting for application of the deltas, only their flushing to disk. That will probably be added some-day, so this is close but no cigar.

In reality, what one could do now is check the 'snapshot' on the follower and the leader and figure out if a change has been propagated or not. This is exposed by the function txid_current_snapshot().

Latency of syncrep is poor, as you suggest, but throughput is about the same -- not everything blocks on acknowledgements from a standby individually, but rather standbys report their progress through a totally ordered stream of changes, and the leader will un-block a "COMMIT" (explicit, or implicit when not using BEGIN) when it sees that a standby has passed the appropriate transaction number (this is a small fib, because I think it think it thinks in terms of a different unit known as WAL Records/XLogPosition, but this is quite close to the truth).

So, for example, one session may only be able to commit a few times a second on a high-latency link, but thirty parallel sessions may not commit at 1/30th the throughput each, but given a case of very small commits closer to 30-times the throughput in aggregate of one client.

Also, 9.2+ has a notion of 'group commit' where this same dynamic (submit deltas, wait on a number to pass by) is applied for local writes relative to the on-file-system crash recovery log, the WAL, even though syncrep to another machine via network was added in 9.1 (i.e. earlier). This is much better at numerous small writes than older versions of Postgres, where it is likely that many backends would issue their own sync requests to disk rather than sharing one.