If I understand correctly, the full URL is hashed (or at least the first n bits of that hash are used). Working out which undesirable URLs were visited isn't computationally expensive: the undesirable domains are already known, so it would be trivial to periodically crawl those domains for all known internal URLs and build a dictionary of hashes to reverse any observed hash back to a URL.
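A minimal sketch of that dictionary attack, assuming SHA-256 and a 4-byte prefix (the actual hash function, prefix length, and the `example-bad.test` URLs here are illustrative assumptions, not details from the scheme itself):

```python
import hashlib

def url_hash_prefix(url: str, prefix_bytes: int = 4) -> bytes:
    # Hash the full URL and keep only the first n bytes (assumed prefix length).
    return hashlib.sha256(url.encode("utf-8")).digest()[:prefix_bytes]

# An adversary who already knows the undesirable domains can crawl them,
# enumerate internal URLs, and precompute a reverse-lookup dictionary.
known_urls = [
    "https://example-bad.test/page1",  # hypothetical crawled URLs
    "https://example-bad.test/page2",
]
dictionary = {url_hash_prefix(u): u for u in known_urls}

# Any hash prefix observed on the wire is then reversed by a plain dict lookup.
observed = url_hash_prefix("https://example-bad.test/page1")
print(dictionary.get(observed))  # recovers the visited URL
```

Prefix truncation only introduces accidental collisions between unrelated URLs; it does nothing to stop an adversary who can enumerate the candidate set in advance.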