(no title)
ydant | 1 year ago
A number of the financial and medical institutions I deal with re-generate PDFs every time you request them, but the content is 99-100% identical. Sometimes just a date changes. So I use a perceptual hash and content comparison to automate detecting truly new documents vs. ones that are only slightly changed.
No comments yet.