

dsies | 4 years ago

^ 100% this. Do not delete events. They are your source of truth; you shouldn't really even modify them, but stripping out PII is "alright".

Re 24M+ records: create a batch runner that works through "jobs" to perform the stripping/cleaning tasks. To store state (and to organize cleaners), use a distributed store such as etcd. That way you can bookmark where you were in the cleaning process and resume from there.
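A minimal sketch of that kind of batch runner, under some loud assumptions: `CheckpointStore` is an in-memory stand-in for a distributed store such as etcd (it is not etcd's API), and `strip_pii`, `run_job`, the `cleaners/<job>/offset` key layout, and the `email`/`name` fields are all hypothetical names made up for illustration. The idea is only to show the bookmark being written after each batch so a restarted job picks up where it left off.

```python
from typing import Callable, Dict, List, Optional


class CheckpointStore:
    """In-memory stand-in for a distributed key-value store such as etcd.

    Hypothetical interface: a real deployment would swap this for a client
    of the actual store.
    """

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


def strip_pii(event: dict) -> dict:
    """Return a copy of the event with assumed PII fields blanked out."""
    cleaned = dict(event)
    for field in ("email", "name"):  # assumed PII field names
        if field in cleaned:
            cleaned[field] = None
    return cleaned


def run_job(
    job_name: str,
    events: List[dict],
    cleaner: Callable[[dict], dict],
    store: CheckpointStore,
    batch_size: int = 2,
    max_batches: Optional[int] = None,
) -> int:
    """Clean events in batches, bookmarking progress after each batch.

    Returns the offset of the next unprocessed event.
    """
    key = f"cleaners/{job_name}/offset"  # assumed key layout
    offset = int(store.get(key) or 0)    # resume from the last bookmark
    batches = 0
    while offset < len(events):
        if max_batches is not None and batches >= max_batches:
            break  # simulate the runner stopping partway through
        end = min(offset + batch_size, len(events))
        for i in range(offset, end):
            events[i] = cleaner(events[i])
        offset = end
        store.put(key, str(offset))      # bookmark only after the batch is done
        batches += 1
    return offset
```

Calling `run_job("strip-pii", events, strip_pii, store, max_batches=1)` and then calling it again without `max_batches` processes the first batch, stops, and then resumes from the stored offset instead of starting over. Since the cleaner is idempotent here, re-running a batch after a crash between the clean and the bookmark write would be harmless.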
