This list is immediately useful for validating user-created new passwords. Just stop with the bizarre rules about having uppercase, lowercase, symbols, numbers, length, etc. Instead require a string not in the top 10K (or 100K, or 1M) most popular.
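A minimal sketch of that check, assuming the wordlist is a plain text file with one password per line, ordered most popular first. The names `load_common_passwords` and `is_acceptable` are made up for illustration, and the inline sample stands in for a real list:

```python
def load_common_passwords(lines, limit=10_000):
    """Collect up to `limit` distinct entries from a popularity-ordered list."""
    common = set()
    for line in lines:
        word = line.rstrip("\n")
        if word:
            common.add(word)
        if len(common) >= limit:
            break
    return common


def is_acceptable(password, common):
    """Reject a new password only if it is in the common-password set."""
    return password not in common


# Tiny inline sample standing in for a real wordlist file (assumption).
sample = ["123456\n", "password\n", "qwerty\n"]
common = load_common_passwords(sample, limit=10_000)

print(is_acceptable("password", common))
print(is_acceptable("correct horse battery staple", common))
```

With a real file, `open(path, encoding="utf-8", errors="replace")` could be passed in place of `sample`; a set keeps each lookup at roughly constant time.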
"Your password must be between 12 to 46 characters long and must include at least one number, upper case character, special character, kanji character, rune and quadratic equation"
Since the list is only rarely updated, this seems a perfect application for Botelho's minimal perfect hashing algorithm[1]. At about 8 bits of storage per item and constant-time lookup, it would be quite practical to use the top 32 million list (passwords that appeared at least 10 times).
Is there really any reason to multiply the downloads by 4 just to have different compression methods?
It seems to me like that just needlessly dilutes the seed swarm and wastes space, since pretty much any modern archive reader can unpack all the provided formats. Sure, specialized command-line utils can't, but if you're using one of those then you probably know which one to use for a given format.
I don't understand: why do you assume that password checkers keep their lists in alphabetically sorted form rather than just loading the whole thing into a db table with an index on it?
As I was looking around for the files to make this project, on SecLists, Weakpass, and Hashes.org, most of the files were in alphabetical order. This was especially true for the larger files.
Not to sound negative, but is this something we should really be exposing? It feels like the only ones who gain from this are hackers and password crackers, no?
Like others have mentioned, the criminal hackers are already up on this stuff. It's more beneficial to the security community as a whole to expose these things so you know how to implement effective security policies. When you're trying to convince management to implement something security related, it's one thing to say "We think this is a threat" versus saying "hey watch this" and demonstrating the threat.
It's not like bad guys don't already have wordlists, or that they are new in any way. Having good ones available, in the open, for everyone to use, provides a net benefit in the long run imo.
It's already exposed. That's the point. The adversary has this intel. It would be irresponsible not to study and publish it, thereby reducing their informational edge.
We've banned this account for abusing the site, including posting many unsubstantive comments after we asked you to stop. Please don't create accounts to break the site rules with.
•These lists are for LAWFUL, ETHICAL AND EDUCATIONAL PURPOSES ONLY.
Yeah, like that is going to stop people from doing nefarious things with this info. If you feel the need to post this screechy, all caps disclaimer, maybe rethink your project entirely?
[+] [-] lucasgonze|8 years ago|reply
[+] [-] microwavecamera|8 years ago|reply
[+] [-] bradleyjg|8 years ago|reply
[1] http://cmph.sourceforge.net/papers/tr06.pdf
[+] [-] expertentipp|8 years ago|reply
[+] [-] gfody|8 years ago|reply
edit: on 2nd thought looks like a bloom filter for 5B entries at p=0.01 would be ~5GB, so not exactly convenient
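For anyone who wants to redo that estimate, here is a rough sketch using the standard sizing formulas for an optimally configured Bloom filter, m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions. The function name `bloom_filter_size` is made up for illustration:

```python
import math


def bloom_filter_size(n_items, false_positive_rate):
    """Return (bits, hash_count) for an optimally sized Bloom filter."""
    bits = -n_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    hashes = (bits / n_items) * math.log(2)
    return math.ceil(bits), round(hashes)


bits, hashes = bloom_filter_size(5_000_000_000, 0.01)
gigabytes = bits / 8 / 1e9  # decimal gigabytes

print(f"{gigabytes:.1f} GB, {hashes} hash functions")
```

That works out to about 9.6 bits per entry, so roughly 6 GB (about 5.6 GiB) for 5B entries, in the same ballpark as the figure above.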
[+] [-] berzerk0|8 years ago|reply
Popularity was based on how many files each password appeared in, after every file had its own internal duplicates removed.
The smallest file had passwords that appeared 75+ times, and the largest file had passwords that appeared 2+ times.
The top 195 thousand (those that appeared 25+ times in the analysis) clock in at 803 KB as a text file with nothing but the passwords themselves.
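One way to read the popularity measure described above, as a sketch rather than the project's actual tooling: deduplicate each source file against itself, then count how many files each password shows up in. The helper names and the three toy "files" are assumptions for illustration:

```python
from collections import Counter


def count_file_appearances(files):
    """Count, for each password, how many source files contain it."""
    counts = Counter()
    for passwords in files:
        # set() drops duplicates within a file, so each file counts once.
        counts.update(set(passwords))
    return counts


def at_least(counts, threshold):
    """Passwords seen in `threshold` or more files, most common first."""
    return [pw for pw, n in counts.most_common() if n >= threshold]


# Toy stand-ins for real source wordlists (assumption).
sources = [
    ["123456", "password", "123456"],
    ["password", "letmein"],
    ["password", "123456", "dragon"],
]
counts = count_file_appearances(sources)

print(at_least(counts, 2))
```

Raising the threshold gives the smaller, stricter lists (75+ appearances), and lowering it to 2 gives the largest one.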
[+] [-] surement|8 years ago|reply
[+] [-] Roritharr|8 years ago|reply
[+] [-] fasteo|8 years ago|reply
[+] [-] berzerk0|8 years ago|reply
The main page contains links to Mega.NZ alternative downloads. Will be fixed shortly, apologies for the inconvenience.
[+] [-] kaslai|8 years ago|reply
[+] [-] infinisil|8 years ago|reply
[+] [-] berzerk0|8 years ago|reply
Two out of the 12 aren't, but every wordlist can be downloaded via torrent in at least 3 compressed formats, including .7z
[+] [-] jacquesm|8 years ago|reply
[+] [-] berzerk0|8 years ago|reply
[+] [-] smaili|8 years ago|reply
[+] [-] microwavecamera|8 years ago|reply
[+] [-] mvdwoord|8 years ago|reply
[+] [-] qeternity|8 years ago|reply
[+] [-] ignoramous|8 years ago|reply
Also see: Security through Obscurity.
[+] [-] macscam|8 years ago|reply
[+] [-] bbcbasic|8 years ago|reply
[+] [-] dang|8 years ago|reply
[+] [-] Mz|8 years ago|reply
Yeah, like that is going to stop people from doing nefarious things with this info. If you feel the need to post this screechy, all caps disclaimer, maybe rethink your project entirely?
Geez.