willquack | 11 days ago
VDiff (v2) only compares the source and destination at a specific point in time, and resume only compares rows with a PK higher than the last one compared before it was paused. I assume this means:
1. VDiff doesn't catch updates to rows with a PK lower than the point at which it was paused, even though those rows could have become corrupt, and
2. VDiff doesn't continuously validate CDC changes, meaning (unless you enforce extra downtime to run or resume a VDiff) you can never be 100% sure your data is valid before SwitchTraffic.
I'm curious whether this is something customers even care about, or is point-in-time data validation sufficient to catch any issues that could occur during migrations?
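To make the first concern concrete, here is a toy sketch of how I understand resume to behave. It is not Vitess code: plain dicts stand in for the source and target tables, and `diff_from`, `after_pk`, and `last_compared_pk` are names I made up for illustration.

```python
# Toy model only, not Vitess code: dicts keyed by PK stand in for tables.
def diff_from(source, target, after_pk=None):
    """Compare rows in PK order, skipping PKs <= after_pk.

    Returns the list of PKs whose rows differ or are missing on one side.
    """
    mismatched = []
    for pk in sorted(set(source) | set(target)):
        if after_pk is not None and pk <= after_pk:
            continue  # a resumed diff never revisits these rows
        if source.get(pk) != target.get(pk):
            mismatched.append(pk)
    return mismatched


source = {1: "a", 2: "b", 3: "c", 4: "d"}
target = dict(source)

# First pass: rows up to PK 2 are compared, then the diff is paused.
last_compared_pk = 2
first_pass = [pk for pk in diff_from(source, target) if pk <= last_compared_pk]

# While paused, the target diverges both below and above the resume point.
target[1] = "corrupt"
target[4] = "corrupt"

# Resuming only looks at PKs above the pause point, so row 1 is never rechecked.
resumed = diff_from(source, target, after_pk=last_compared_pk)

# A brand new diff over all rows sees both problems.
fresh = diff_from(source, target)

print(first_pass, resumed, fresh)
```

In this model the resumed run reports only row 4, while a fresh run over all rows also reports row 1. Neither run says anything about changes that land after it finishes, which is the second concern.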
mattlord | 11 days ago
But there's also nothing stopping you from doing a new VDiff to cover all data at that later point in time.
freakynit | 10 days ago
willquack | 10 days ago
I think it's still the same issue: data modified after the VDiff point in time isn't validated before SwitchTraffic. I'm mostly curious how Vitess users handle this case, or whether any users even care about it in the first place.
Is there no demand for continuous data validation similar to what TiDB offers?
Do people who care about 100% correct data validation just accept the downtime required to run a full VDiff before SwitchTraffic?