top | item 39776789

pQd | 1 year ago

We're using BTRFS to host PostgreSQL and MySQL replication slaves. We're snapshotting the drives holding data for both every 15 minutes, 1 hour, 8 hours and 12 hours, and we keep a few snapshots for each frequency.
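A schedule like that is usually a small cron-driven script per frequency. Here's a minimal sketch of that approach, not the commenter's actual setup - the paths, the retention count, and the `BTRFS` override variable are all assumptions:

```shell
#!/bin/sh
# Hypothetical snapshot-rotation sketch for one frequency tier.

# snapshot_and_prune <source-subvolume> <snapshot-dir> <keep-count>
snapshot_and_prune() {
    src=$1; dst=$2; keep=$3
    # Timestamped names sort chronologically, which the pruning below relies on.
    name=$(date +%Y%m%d-%H%M%S)
    # -r makes the snapshot read-only, fitting its inspect/recover role.
    # BTRFS can be overridden (e.g. BTRFS=echo) for a dry run.
    ${BTRFS:-btrfs} subvolume snapshot -r "$src" "$dst/$name"
    # Drop the oldest snapshots beyond the retention count.
    # (head -n -N is GNU coreutils syntax.)
    ls -1 "$dst" | head -n -"$keep" | while read -r old; do
        ${BTRFS:-btrfs} subvolume delete "$dst/$old"
    done
}

# e.g. from cron every 15 minutes:
# snapshot_and_prune /srv/postgres /srv/snapshots/15min 4
```

One script per tier (15min, 1h, 8h, 12h), each with its own directory and keep-count, keeps the rotation logic trivial.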

Those replicas are not used for any workload, aside from nightly consistency checks on the MySQL ones via pt-table-checksum to ensure we don't have data drift.
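For reference, a nightly pt-table-checksum run is typically scheduled against the primary (checksums replicate down and get compared on the slaves). A hedged crontab sketch - the hostname, time, and checksum table are assumptions, not the commenter's config:

```shell
# crontab fragment (hostname and schedule are hypothetical)
# 03:00 nightly: checksum tables on the primary; replicas are compared
# via the replicated percona.checksums table.
0 3 * * * pt-table-checksum --replicate=percona.checksums h=db-primary.example.com
```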

Snapshots are crash-consistent. Once in a while they give us the ability to very quickly inspect how the data looked a few minutes or hours ago. This can be a lifesaver in case of fat-fingering production data, and it has saved us from lengthy grepping of backups when we needed to recover a few records from a specific table.

Yes, I know - soft deletes, audit logs - all of those could help, and we do have them, but sometimes that's not enough or not feasible.

Due to its COW nature, BTRFS is far from perfect for data that changes all the time [databases busy with writes, VM images with plenty of disk write activity]. There's plenty of write amplification, but that can be solved by throwing NVMe drives at the problem.

dilyevsky | 1 year ago

How do you avoid heavy fragmentation caused by random writes? Do you disable COW (sounds like "no", given you snapshot)? Or autodefrag (how's performance)?