top | item 24749479

(no title)

nulptr | 5 years ago

I think I found a way to do it with ~8GB. As you mentioned you only need 2^30 tries due to birthday paradox.

But, instead of using a hash table to store (seed, hash) pairs, we can just use a list of only hashes. To generate the next hash we use current hash. i.e. next_hash = hash("example.com/" + string(int64(cur_hash))). So now, we just store 60-bit hash values which takes up 2^30 * 60 bits = 8gigs.

The only other problem is detecting if a hash already exists, which I think can be done using a space efficient probabilistic hashset like a bloom filter (and when the bloom filter says the hash already exists, we do a linear scan to find where it exists and we have a possible collision).

discuss

order

penteract|5 years ago

Actually, using that method, you'd need much less than 8GB, since a collision would mean you'd hit a cycle, so you could store something like 1 value for every 2^10 iterations. When you found a collision with one of those, you could examine the interval more carefully to find exactly where the collision was.

nulptr|5 years ago

Ahh yes, that's very clever. Thanks for sharing :)

Thorrez|5 years ago

Interesting, that's similar to rainbow tables.