top | item 26285662

(no title)

amenonsen | 5 years ago

It doesn't make sense to you because everything in the comment you're replying to is wrong.

Neither log shipping (copying WAL files one by one) nor streaming replication (sending a stream of WAL) works by sending queries. WAL segments are 16MB by default, and the default archive_timeout is 0, not 1 minute (and the archive timeout is not applicable to streaming replication anyway). There is also nothing "adaptive" about the replication—when there is no traffic, there will be ~no changes, and when there are changes, they will be sent to the replica.

I don't understand what the comment is suggesting used to happen in periods of no activity that made replication unusable, but it is also probably incorrect, and has nothing to do with the write amplification problem.

discuss

Quekid5|5 years ago

Thank you for confirming my suspicions :).

user5994461|5 years ago

I was calling the WAL the "queries" to simply, never mind that, it doesn't matter whether it contains the queries or not.

What's important is that the WAL was generated on a periodic basis and of a constant size. Say 16MB every minute. It's pretty much a plain file, that could be stored on S3/FTP.

This had a lot of drawbacks:

- Replicas were measurably late behind the current state, simply because of the built-in delay in "replication".

- It was incredibly inefficient on bandwidth and storage. Consider the time it takes to transfer large files (especially for off-site replicas) and storage costs. That further contributed to poor performance and delay.

- There could be many WAL files generated at once when there were changes happening. They would take FOREVER to be processed. It was commonplace for replicas to fall 5-10 minutes under what I consider to be minimum activity.

Long story short, the replication was reworked in a later version of PostgreSQL (3 or 4 years ago), the part about fixed size and fixed delay is not true anymore.