halffullbrain | 5 months ago
Some of the same techniques apply, such as using domain primitives. But some PII (names and addresses, for instance) eventually gets templated into flat text values and is then processed by other layers that do not recognize 'brands' as suggested.
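To make the "brand gets lost" point concrete, here is a minimal Python sketch. `PersonName`, `reveal`, and `render_letter` are hypothetical names made up for illustration, not something from the article. The wrapper protects the value only until someone templates it into a string.

```python
from dataclasses import dataclass


@dataclass(frozen=True, repr=False)
class PersonName:
    """Hypothetical domain primitive wrapping a piece of PII."""

    value: str

    def __repr__(self) -> str:
        # Redact by default so accidental logging does not leak the name.
        return "PersonName(<redacted>)"

    def __str__(self) -> str:
        return "<redacted>"

    def reveal(self) -> str:
        # The only deliberate way to get the raw value out.
        return self.value


def render_letter(name: PersonName) -> str:
    # Templating needs the real value, so the wrapper is unwrapped here.
    # The result is a plain str: downstream code cannot tell it holds PII.
    return f"Dear {name.reveal()}, your order has shipped."


name = PersonName("Ada Lovelace")
letter = render_letter(name)
```

Here `letter` is an ordinary `str` containing the name, so any layer that only sees `letter` has no type-level hint that it carries PII.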
Data scanners: Regexes are fine for SSNs and the like, but to be really effective one would need a full-on Named Entity Recognition (NER) step in the pipeline, perhaps just as a canary. (Wait, that might actually work?)
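A rough sketch of the regex side, assuming US SSNs written in the dashed form only. The pattern and the helper names `find_ssns` and `redact_ssns` are illustrative assumptions, not a complete validator. It also shows the gap: the name passes straight through, which is where an NER step would have to come in.

```python
import re

# Assumption: US SSNs in the dashed form only, e.g. 123-45-6789.
# This does not check for invalid area or group numbers.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssns(text: str) -> list[str]:
    """Return every dashed SSN-looking token found in the text."""
    return SSN_PATTERN.findall(text)


def redact_ssns(text: str) -> str:
    """Replace every dashed SSN-looking token with a placeholder."""
    return SSN_PATTERN.sub("[SSN]", text)


sample = "Customer Ada Lovelace, SSN 123-45-6789, called."
found = find_ssns(sample)
redacted = redact_ssns(sample)
```

After redaction the SSN is gone, but "Ada Lovelace" is still sitting in `redacted`, because no regex of this kind can know that it is a person's name.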
Dataflow analysis and control apply in a BIG way, e.g. separating an audit log for forensics, where you really NEED the PII, from a technical log which the SREs can dig into without being suspected of stealing sensitive info. Start there.
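One way the audit/technical split could look with Python's standard `logging` module. This is a sketch under assumptions: the logger names, the `record_login` helper, and the opaque `user_id` are all made up here, and the in-memory streams stand in for two real sinks with different access controls.

```python
import io
import logging

# Stand-ins for two separate sinks. In practice these would be different
# files or services, with the audit sink under much stricter access control.
audit_stream = io.StringIO()
tech_stream = io.StringIO()


def make_logger(name: str, stream: io.StringIO) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


audit_log = make_logger("audit", audit_stream)
tech_log = make_logger("tech", tech_stream)


def record_login(user_id: str, email: str) -> None:
    # Forensics needs the PII, so it goes to the audit log only.
    audit_log.info("login user_id=%s email=%s", user_id, email)
    # The technical log gets just the opaque id (assumed not to be PII).
    tech_log.info("login user_id=%s", user_id)


record_login("u-1001", "ada@example.com")
```

The point of routing at the call site is that the decision about which sink may see PII is made once, in code that can be reviewed, instead of being scrubbed after the fact.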