Why the password "this is fun" is 10 times more secure than "J4fS!2"

[+] ianferrel|15 years ago|reply

The author relies on the assumption that one can try 100 passwords per second (or, fewer, in the case of an extra delay), but that doesn't correspond to real-world security breaches.

The danger of having an insecure password is not that someone will bombard the server with login requests. That can easily be detected and stopped by even the most cursory of server security. The danger is that they crack the server and get the list of password hashes, at which point the time to crack a password is dictated by the hardware at their disposal and the hashing algorithm. Your server capacity or timeout protocols are irrelevant.

So, the 100 attempts/sec number is essentially a fiction. It applies only to a manufactured threat. The real threat is much worse, which means that a password like "this is fun" is not reasonably secure.

[+] asharp|15 years ago|reply

To be more specific they are dictated by the hashing algorithm and how it is set up.

Say you have something like a straight md5/sha256/etc. It's fairly likely that there exist rainbow tables that will insta-crack (for some reasonable value of insta) any reasonable password.

On the other hand, if you use a salted hash you aren't vulnerable to rainbow tables, but an attacker can still try an altogether silly number of passwords per second given enough hardware (or again, more specifically, GPUs against most common salted hashes).

On the other other hand, if you use a memory-hard key derivation function (scrypt/etc.), you can quite easily set things up such that even with a ridiculous amount of hardware it is infeasible to launch any sort of attack. The problem, then, is that the harder you make it for attackers to attack your passwords, the slower normal logins are for you, which affects scaling.

So at the end of the day you need to weigh up how much of a problem this is in your specific circumstances, which will then dictate the solution you can provide.
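As a rough illustration of the salted, memory-hard option described above, here is a minimal sketch using `hashlib.scrypt` from the Python standard library. The cost parameters and the helper names `hash_password` and `verify_password` are assumptions chosen for the example, not recommendations from the thread.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions, not tuned recommendations).
# Memory use is roughly 128 * N * R bytes, about 16 MiB with these values.
SCRYPT_N = 2**14
SCRYPT_R = 8
SCRYPT_P = 1


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, key) for storage. A fresh random salt per user defeats rainbow tables."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return hmac.compare_digest(key, expected_key)
```

Raising `SCRYPT_N` makes each guess more expensive for an attacker and each login slower for the server, which is the scaling trade-off mentioned above.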

[+] andylei|15 years ago|reply

i think the point is that increased password length can make easy to remember passwords that are more secure than shorter, hard to remember passwords.

even though "this is fun" isn't secure in your scenario, it's still more secure than the "j63<2a" password in the same scenario.

[+] ominous_prime|15 years ago|reply

> The danger is that they crack the server and get the list of password hashes

This isn't an argument either way. The service poorly handling passwords is even more culpable than the user with poor password complexity. Passwords should never be stored as plain hashes.

[+] phamilton|15 years ago|reply

Your argument does essentially shoot down any merit to 5 second login attempt delays, and a cap on the number of attempts in 15 minutes.

[+] tel|15 years ago|reply

As I wrote in another comment, high entropy passwords by definition must be hard to remember. It's not strictly true, but it definitely refutes the title of this submission.

While I think this post is rather optimistic in its calculations — using maximum entropy distributions, for instance — it does bring up a good point: Personally memorable nonsense sentences are rather high entropy.

Actual practical guessing is not pure distribution entropy but instead closer to the KL divergence since an intelligent brute force guesser has to make assumptions about the password distribution in order to reap benefits. If your password comes from an expected distribution (letters in English words, words in English sentences) you're losing a whole lot of potential entropy, traded for particular memorability.

If you can hedge between those bets, though, you're in a good place. "this is fun" is not actually terribly secure compared to "J4fS!2" unless you're actually attacked by a uniform dictionary brute force search. "slurping radicals debilitate enzymatically" is super high entropy and quite likely easier to remember than "J4fS!2".

[+] qjz|15 years ago|reply

Yet, if your Gawker password had been "passwords delight entropy enthusiasts", it would have been cracked immediately, due to the 8-character truncation. "i<3turtles" would probably have been much safer. Unfortunately, users rarely know anything about the systems used for authentication, so any advice should keep poor systems in mind. That's why I frontload my passwords with a few characters of high entropy before appending a longer high entropy (but memorable) phrase.

[+] tzs|15 years ago|reply

I once did the following for a password that I wanted to be able to remember, but that I wanted to be very secure against attack. It was easy to remember, yet very effective. How does this fit into the entropy vs. rememberability thing?

1. Take a 100x100 binary matrix, initialized to 0.

2. Change some bits to 1, in a pattern that I memorized. It was just something simple, like my initials in dot-matrix form.

3. Apply Conway's Life cellular automata rules to the matrix, with no wrap, doing 108 iterations.

4. Read out the final state of each cell row by row, as an ASCII representation of a binary string of 10000 bits.

5. Run that through MD5. Iterate this 1960 times.

6. The final MD5, as an ASCII representation of a 128-bit hex number, was my password.

I did NOT have any of the software for this on any computer. Whenever I needed this password, I'd write a Life program, enter my memorized pattern, run it, do the MD5 stuff, and then delete the program.

Effectively, my password was the combination of the Life algorithm, the size of the input, the input pattern, the number of iterations to run of Life, and the algorithm for converting the final Life state into an ASCII password string.

There seems to be a lot of entropy here. Yet it was pretty easy to remember.
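The six steps above can be sketched roughly as follows. This is one possible reading, not tzs's actual program: it assumes cells outside the grid are always dead ("no wrap"), that the readout is a 10000-character string of "0" and "1", and that each MD5 round hashes the previous round's hex digest. The function names are hypothetical.

```python
import hashlib

SIZE = 100          # 100x100 grid, from the comment
GENERATIONS = 108   # Life iterations, from the comment
MD5_ROUNDS = 1960   # total MD5 applications, from the comment


def life_step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """One generation of Conway's Life on a bounded grid; off-grid cells stay dead."""
    counts: dict[tuple[int, int], int] = {}
    for (r, c) in live:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if 0 <= nr < SIZE and 0 <= nc < SIZE:
                    counts[(nr, nc)] = counts.get((nr, nc), 0) + 1
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


def derive_password(pattern: set[tuple[int, int]]) -> str:
    """Run Life on the memorized pattern, then hash the final grid repeatedly with MD5."""
    live = set(pattern)
    for _ in range(GENERATIONS):
        live = life_step(live)
    bits = "".join("1" if (r, c) in live else "0"
                   for r in range(SIZE) for c in range(SIZE))
    digest = hashlib.md5(bits.encode("ascii")).hexdigest()
    for _ in range(MD5_ROUNDS - 1):
        digest = hashlib.md5(digest.encode("ascii")).hexdigest()
    return digest  # 32 hex characters
```

Under this reading, an attacker who knows the scheme only has to guess the starting pattern and the two iteration counts, so the effective entropy is that of those memorized choices rather than of the 128-bit output.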

[+] cookiecaper|15 years ago|reply

I think this is the real key. Sentences as passwords are really hard to crack as long as you choose a reasonably long sentence that you can remember easily but is not overly simplistic or obvious, and if you include a few typos or non-standard capitalization and/or punctuation it becomes that much more difficult to crack.

A sentence is the only way you'll get normal users to memorize a password > 6 chars. I think a sentence is much better than what most people end up using, which is a personally significant word (like a child's, spouse's, or pet's name) + a brief personally significant numeric suffix (like birth year).

[+] tlrobinson|15 years ago|reply

In practice is a long nonsense sentence of English words really any better than an equally long coherent sentence of English words? Are cracking tools really that intelligent?

[+] patio11|15 years ago|reply

The danger with semantically meaningful passphrases is that they have a lot less entropy than you think they do. I still use them for everything, but it is something to be aware of. (You can get more by e.g. padding it with a number, doing the usual l33tspeak tricks, etc.)

[+] Florin_Andrei|15 years ago|reply

As a purely theoretical question - if they are _usual_ tricks then wouldn't they also exhibit low entropy on some level?

[+] chadgeidel|15 years ago|reply

As others have mentioned, I don't think "l33t" is going to help you there. Remember - a dictionary attack isn't based on "THE dictionary" (Websters et al.) but A dictionary. If I were to write a password cracker, the first thing I would do was take a list of common passwords and "l33tify" them (trivial) to create my cracking dictionary. Of course, this is still a brute force attack, and all that entails.
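To show how trivial the "l33tify" step is, here is a minimal sketch that expands one word into every spelling under a small substitution table. The table `LEET` is a hypothetical example; real cracking rule sets are far larger.

```python
from itertools import product

# Hypothetical substitution table: each letter maps to itself plus its common replacements.
LEET = {"a": "a4@", "e": "e3", "i": "i1!", "o": "o0", "s": "s5$", "t": "t7"}


def leet_variants(word: str):
    """Yield every spelling of `word` obtainable from the substitution table."""
    choices = [LEET.get(ch, ch) for ch in word.lower()]
    for combo in product(*choices):
        yield "".join(combo)
```

With this table, "password" expands to only 54 candidates, so the substitutions add fewer than 6 bits of work for an attacker who expects them.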

[+] ay|15 years ago|reply

The usual trick is to use only the first letters of each word: e.g. something like "Mttboom1cuia!Igdtdts1flIkohd." is reasonably strong and is easy to remember (it's just a song).

Or you use all the characters ?

[+] bhrgunatha|15 years ago|reply

I wonder if word substitution would be better than character substitution?

cucumber of our discontent pot calls kettle soapy

Still relatively easy to remember but with less chance of being in some dictionary.

[+] ominous_prime|15 years ago|reply

remember though, the l33t letter replacements are in the dictionaries for common passwords, and used in brute-force algorithms.

[+] cdavid|15 years ago|reply

Or more simply, use a passphrase where the words are truly selected at random (diceware).
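A minimal sketch of the diceware idea, using Python's `secrets` module in place of physical dice. The standard Diceware list has 7,776 words (6^5, one word per five dice rolls); the function names and the caller-supplied word list are assumptions for the example.

```python
import math
import secrets


def diceware_passphrase(wordlist: list[str], n_words: int = 5) -> str:
    """Pick n_words uniformly at random, with replacement, from wordlist."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))


def passphrase_entropy_bits(wordlist_size: int, n_words: int) -> float:
    """Entropy of a uniformly random passphrase: n_words * log2(wordlist_size)."""
    return n_words * math.log2(wordlist_size)
```

Because every word is chosen uniformly, the entropy estimate holds even if the attacker knows the list and the method: five words from a 7,776-word list give about 64.6 bits.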

[+] juiceandjuice|15 years ago|reply

Assume there are 7500 very commonly used English words. A three word sentence, all in lower case, would yield 421 billion permutations. Let's say that of those 421 billion permutations (which ignore tenses and plural nouns), about 1 in 5 are familiar English-language constructs, which drops it down to around 84 billion permutations. By comparison, for a six-character random password, if we took all capital letters, all numbers, and 6 punctuation characters, we'd end up with ~67 unique characters, for a combined 82 billion permutations.

Furthermore, like Richard Feynman discovered in Los Alamos, you could narrow down the possibilities of combinations if you know something about a person. You could probably build profiled dictionary attacks and reduce possibilities a lot.

So, is it more secure? No, it's maybe equally secure, but it would completely depend on the attack. A combination of capital letters would probably be more secure though.

[+] juiceandjuice|15 years ago|reply

I thought about this a bit more and, in reality, the number of English words used could probably be cut down to less than 1000, which would cut the search space down to about 1/400th of my original figure. So no, I don't think it's more secure against advanced dictionary attacks.

[+] T-hawk|15 years ago|reply

>if we took all capital letters, all numbers, and 6 punctuation characters, we'd end up with ~67 unique characters, for a combined 82 billion permutations.

82 billion may sound like a lot but it is not. Suppose you had a few multi-core machines that could collectively get up to 1 million crack attempts per second. Now you can crack that password in 82,000 seconds, or under one day.
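The arithmetic in this sub-thread can be checked with a few lines. The figures are the ones quoted by the commenters (7500 words, 82 billion permutations, 1 million guesses per second offline, 100 per second online); the function name is hypothetical.

```python
def seconds_to_exhaust(search_space: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate at a fixed guessing rate."""
    return search_space / guesses_per_second


THREE_WORD_SPACE = 7500 ** 3          # three words from a 7500-word vocabulary
RANDOM_SIX_SPACE = 82_000_000_000     # figure quoted in the thread

offline_hours = seconds_to_exhaust(RANDOM_SIX_SPACE, 1_000_000) / 3600
online_years = seconds_to_exhaust(RANDOM_SIX_SPACE, 100) / (365.25 * 86400)
print(f"three-word space: {THREE_WORD_SPACE:,}")
print(f"offline at 1M guesses/s: {offline_hours:.1f} hours")
print(f"online at 100 guesses/s: {online_years:.1f} years")
```

The same search space that holds out for decades against a throttled login form falls in under a day once the hashes are available offline.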

[+] ezy|15 years ago|reply

One obvious solution: Don't use english, and don't use standard word order.

[+] cool-RR|15 years ago|reply

One advantage of gibberish passwords like "b923F$5mvA" is that if someone looks at your fingers while typing them, he'll have a hard time figuring them out from your keypresses, whereas if you typed "this is fun", it would be much easier.

Ditto for when someone has a visual glimpse of your password which is only a few seconds long. (e.g. someone looked at your laptop screen while you got an email with your password from an irresponsible website.)

[+] jjcm|15 years ago|reply

This should be fairly obvious to anyone who's done any sort of combinatorics - you're saying that an 11 character password using symbols and lower case letters is more complex than a 6 character password using 36 more characters in the character pool. Anyone who's even glanced at password complexity research will be able to tell you that. To break down the numbers though, an 11 character password using lowercase letters and symbols (spaces), a pool of 59 characters, has 30155888444737843000 possible combinations. A six character password drawn from all 95 printable characters has 735091890625 combinations (around 1/41,000,000th of the complexity, assuming a brute force approach). While the author also takes into account the possibilities of using a dictionary attack, you can't really tie a number to the search space for that. It depends on the breadth of what the program will go to. Will it check alternate spellings (color/colour)? Will it check for apostrophes? Foreign languages? Etc.

A while back I wrote a small piece of JS to demonstrate to some people the complexity growth in passwords. Some people didn't believe me that asdfasdfasdf was more complex than Fc34!j_, and this was the end result. Feel free to play with it. The source is rather simple as well:

http://files.jjcm.org/jspass
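For readers who cannot reach the link, here is a rough Python sketch of the same kind of estimate. It is not jjcm's script: it assumes the pool is the sum of the character classes the password touches (lowercase, uppercase, digits, and the 33 printable ASCII symbols including space) and that the attacker brute-forces pool^length candidates.

```python
import string

# 32 punctuation characters plus the space: the 33 printable ASCII symbols.
SYMBOLS = string.punctuation + " "
POOLS = [string.ascii_lowercase, string.ascii_uppercase, string.digits, SYMBOLS]


def search_space(password: str) -> int:
    """Brute-force search space: (size of the character pool used) ** length."""
    pool = sum(len(p) for p in POOLS if any(ch in p for ch in password))
    return pool ** len(password)
```

Under these assumptions "this is fun" draws on a 59-character pool over 11 characters, "J4fS!2" on a 95-character pool over 6, and "asdfasdfasdf" (26^12) still comes out ahead of "Fc34!j_" (95^7). None of this accounts for dictionary attacks.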

[+] synnik|15 years ago|reply

There is a huge logic gap here. He is comparing 11-character passwords to 6-character passwords. The difference in length will also account for a significant difference in the time required with brute force.

I think what he is trying to show is that the lower security of using multiple common words in an 11-character password is still greater than the security of a random 6-character password, and still quite acceptable.

[+] alexqgb|15 years ago|reply

Actually, I think that's his point - by using a longer passphrase instead of a shorter passcode, you get something inherently more secure AND more memorable (added bonus: no post-its, which means one more layer of security).

[+] seles|15 years ago|reply

This isn't a logic gap. The 11 character password is easier to remember and faster to type than the 6 character one, so it is perfectly fair to compare them.

[+] _b8r0|15 years ago|reply

It's interesting. The author shows some fundamental misunderstandings and makes assumptions that are not necessarily based on real-world situations to present an idea that longer strings with more recognisable characters (e.g. passphrases) are better than shorter strings with larger keyspaces. If you pick two data points you can actually fiddle with the numbers to present either side of the argument as the truth. For example:

A full 16-bit Unicode 2 character password has 65,536^2, or 4,294,967,296, permutations to work through.

A lowercase alphabetic password of 6 characters in length has 26^6 = 308,915,776 permutations.

Of course there's a tradeoff involved, and that tradeoff is what IT departments try to manage, with mixed success. It's easier for the software to determine whether or not the password contains methods of increasing the keyspace than whether or not the user has typed out a 200 character long series of 'A' characters, so that's what they use. I don't know whether or not increased length has a higher risk of collisions for some algorithms (that's tptacek territory, not mine). Over time, software products have been guided by best practice standards from organisations like COBIT that define and mandate complex passwords based on keyspace rather than length alone.

Secondly, there is a difference between an online brute force and an offline brute force. Depending on the algorithm, with the right kit (or Amazon EC2 instances) you can get billions of hashes per second to crack a password hash offline. At that point your increased length only matters if the attacker doesn't know about the complexity. The samples provided are terrible as they're all lower case with a space at most. This is the poor end of the trade-off. To brute force a SHA-1 hash of the word 'sum' on my 2 year old laptop takes less than a second.
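A minimal sketch of the offline case described above: exhaustively hashing every lowercase candidate up to three letters against a stolen SHA-1 digest. The function name and the three-letter limit are assumptions for the example; real crackers run on GPUs and are many orders of magnitude faster.

```python
import hashlib
import string
from itertools import product


def brute_force_sha1(target_hex: str, alphabet: str = string.ascii_lowercase,
                     max_len: int = 3):
    """Return the first candidate whose unsalted SHA-1 matches target_hex, or None."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.sha1(candidate.encode("ascii")).hexdigest() == target_hex:
                return candidate
    return None


stolen_hash = hashlib.sha1(b"sum").hexdigest()
print(brute_force_sha1(stolen_hash))
```

The whole search covers only 26 + 26^2 + 26^3 = 18,278 candidates, which is why it finishes in well under a second even in pure Python.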

Online brute forcing (e.g. brute forcing a web form) is generally something you're not going to do if you're looking to compromise a web account, unless it is a specifically targeted attack (e.g. the user is an admin or a specific person of interest). In these situations your brute force rate is dependent on your network throughput, the application's ability to respond to concurrent requests and any other factors that may affect it (such as any monitoring system). Your web application on a linode slice will probably choke out between 40 and a hundred attempts per second (and you'll notice it unless you're blind or have no performance reporting). If you can get past the automation detection in larger sites, like Amazon, Google, Twitter etc. you'll probably be able to go much, much higher. For this reason, web site brute forces tend to be dictionary based, or at most on dictionaries and a number.

Ultimately when choosing a password you need to consider what you're defending against. If you own the box or the app, chances are you trust the defences. If you're going to change the password every few months then maybe you will choose a weaker password. If you don't own the kit and you're not intending to use the password, then use a tool such as Keepass[1] and generate the passwords yourself. That way it doesn't matter what J Arthur Random says on the Internet, you won't need to remember the passwords at all.

[1] - http://keepass.info/

[+] neuroelectronic|15 years ago|reply

Right, but to put it simply, character count is the only metric that matters. If your password is a sentence that's 44 characters long, then you're set.

[+] GrandMasterBirt|15 years ago|reply

That's not true and is horsecrap.

a) check length of password. Length < 6 = fail. Length < 10 = complexity requirement. Length < 15 = less complexity requirement. Length > 15 = secure.

b) ALWAYS do a dictionary lookup on the password. A dic lookup vs a dic attack is cheap. Hell the dic can be stored in memory in a hash map... even with misspellings, we don't even need to be 100% accurate either. Once you have that, basically check complexity: 1 word = no good. 2 words = ok, add a special character in there and a caps. 3 words = great. 3 words + special characters = awesome. etc.

Anyone can make one of these in a few hrs. and give error messages like:

"Your password is too short"

"your password is too simple for its length, either make it longer or add special characters, numbers, and capital letters"

"your password is made up of a single word or easily obtainable information about you based on information we have in our system, please change the words used or add more of them"

This means: (a) people understand why password is rejected and (b) people have choices and (c) increases probability that people won't forget their password.
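A rough sketch of the tiered check described in (a) and (b). The thread gives only the length tiers, so the specific class counts (3 below 10 characters, 2 below 15), the treatment of exactly 15 characters as "secure", and the tiny `COMMON_WORDS` set are all assumptions made for the example.

```python
# Hypothetical stand-in for an in-memory dictionary of words and common passwords.
COMMON_WORDS = {"password", "letmein", "dragon", "monkey", "sunshine"}


def char_classes(pw: str) -> int:
    """Count how many of lowercase / uppercase / digit / symbol appear in pw."""
    return sum((
        any(c.islower() for c in pw),
        any(c.isupper() for c in pw),
        any(c.isdigit() for c in pw),
        any(not c.isalnum() for c in pw),
    ))


def check_password(pw: str):
    """Return an error message explaining the rejection, or None if pw is accepted."""
    if len(pw) < 6:
        return "Your password is too short"
    if pw.lower() in COMMON_WORDS:
        return "Your password is made up of a single word"
    # Shorter passwords must compensate with more character classes.
    required = 3 if len(pw) < 10 else 2 if len(pw) < 15 else 1
    if char_classes(pw) < required:
        return "Your password is too simple for its length"
    return None
```

With these thresholds both "J4fS!2" and "this is fun" pass, while "asdfasdfasdf" is rejected as too simple for its length.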

Oh and any system that limits passwords to < 50 characters can fo fk themselves. Too bad ADP (paystub) has a policy of 6-12 characters. That's right, limitations, and I have to use that crap. Is a 4000 byte database column that expensive nowadays?

Furthermore... Changing password every 1-3-6 months is also bad. WHY? Simple, very very simple: 99% of the time you will have one of the following situations:

- The same password is used + an extra 1 or 2 characters.

- The same password is used + different capitalizations.

- A different password is used + post-it note.

- The user will forget their password immediately.

The TRUE reason why IT policies are the way they are is that people don't want to think. "Tom did it this way, it must be secure." "I don't really understand it, so whatever, it's the standard, it must work."

To make passwords secure they need to enact the policies I stated + a second factor for authentication. If you log in from an "unknown" location (a new one) just send them an email, or sms, or something that gives some key to enter to authorize. This will work in letting people know there is a breach, so they change their pw when needed; don't make ppl change passwords nonstop. Lots of advantages.

Wow writing this I feel like we need a startup that provides this exact solution. One question for the fellas who know more... how hard is it to spoof someone's ip without them knowing (internal network or external)

[+] merloen|15 years ago|reply

In my 25M word corpus, "this is fun" occurs 23 times. There are only 94,000 trigrams that occur more frequently.

Therefore, you should be pessimistic, and consider the password "this is fun" less safe than passwords in the shape [a-zA-Z]{3}, like "tsP", of which there are 140608.

Assume attackers know the algorithm (e.g. three common words, one 7-letter word in l33tspeak, a 6-letter string of random ascii characters) but not the parameters.
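The corpus argument above can be reproduced in miniature. This sketch counts word trigrams in a text and reports how many distinct trigrams occur strictly more often than a given phrase; the function names and the toy corpus in the usage note are assumptions, not merloen's data.

```python
from collections import Counter


def trigram_counts(text: str) -> Counter:
    """Count every run of three consecutive words in text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:], words[2:]))


def trigrams_more_frequent(counts: Counter, phrase: str) -> int:
    """Number of distinct trigrams that occur strictly more often than phrase."""
    target = counts[tuple(phrase.lower().split())]
    return sum(1 for n in counts.values() if n > target)
```

An attacker trying trigrams in frequency order reaches a phrase after roughly that many guesses. With the 94,000 figure quoted above, that is fewer guesses than the 52^3 = 140,608 strings of the form [a-zA-Z]{3}.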

[+] jasonwatkinspdx|15 years ago|reply

The most useful thing we can do as web developers:

- support very long passwords, so that users can use pass-phrases if they like.

- use bcrypt or the like for storage

- do not create easily cracked side channels, like a fixed set of "security questions" for forgotten passwords

[+] jerfelix|15 years ago|reply

Forget "security questions" altogether. We've gone to a set of "insecurity questions". It makes people smile when they've forgotten their password.

    "Do these jeans make me look fat?"
    "Why is everyone staring at me?"
    "Is there something in my teeth?"
    "Are they talking about me?"

... yes I jest, but that would be cute (and serve no practical purpose).

[+] teaspoon|15 years ago|reply

You should also make your sites amenable to password managers that autofill and submit login forms.

Bank of America, for example, foils 1Password by spreading login across two forms and using a custom, non-machine-discoverable "submit" button in one of them.

[+] onedognight|15 years ago|reply

"this is fun" has structure and I suspect is an easy password to guess from the pool of all three word passwords. Just like using a dictionary is better than brute force, trying common words that usually go together when guessing three word passwords is much better than trying all three word passwords. If Google were to write a "word" password cracker using their data trove, I suspect "this is fun" would go down early. Likewise putting spaces between words would be to the Google cracker like adding a number on the end of a dictionary word is to a standard cracker.

[+] LiggityLew|15 years ago|reply

What about moving away from semantics (easy to remember words) and change to patterns on the keyboard? That's how I handle my most secure passwords. Patterns on the keyboard using all the keys and combinations of shift create passwords that are easier to remember than a random length string, and can become quite long (>8 chars).

[+] ig1|15 years ago|reply

No it's not. If I was brute forcing a password these days, I'd use the google ngram database, and "this is fun" and pretty much any other memorable phrase would fall pretty quickly.

[+] glenjamin|15 years ago|reply

Would you do that before or after your dictionary attack? Or brute force character ngrams? Or pure brute random chars? There's a fair few responses pointing out that it's not a hugely uncommon ngram, but the known search space in this case is just "a password field", the application/organisation's rules might tell us min length, or if we can discount no numbers/symbols.

Time between events should be limited by either the app's login function or a suitably expensive hashing function (whose algorithm will have to be known in the case of a DB dump)

[+] dvdhsu|15 years ago|reply

I refuse to believe that there are no tools that can dictionary attack sentences.

[+] colanderman|15 years ago|reply

I'm willing to bet that the set of 8-character full-ASCII passwords is comparable in cardinality to the set of 8-word sentences, even accounting for Markov information density.

[+] ernestipark|15 years ago|reply

Using a grammar and a dictionary of common words I can see attacks on 3 or 4 word passwords being somewhat feasible although I should do some back of the envelope calculations.

[+] stretchwithme|15 years ago|reply

hmm, perhaps I'm missing something, but shouldn't systems just not allow you to attempt to login so much and so frequently?

I guess these systems do get hammered by so many improper attempts and you'd risk blocking the actual account owner. But if a billion attempts are made from the same ip in an hour, shouldn't that be considered suspicious?
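One way such throttling could look is a sliding-window counter per IP address or account. This is a minimal in-memory sketch: the class name, the cap of 10 attempts and the 15-minute window are assumptions, and a real deployment would need shared storage and a way to avoid locking out the legitimate owner.

```python
import time
from collections import defaultdict, deque


class LoginThrottle:
    """Allow at most max_attempts login attempts per key within a sliding window."""

    def __init__(self, max_attempts: int = 10, window: float = 15 * 60,
                 clock=time.monotonic):
        self._max = max_attempts
        self._window = window
        self._clock = clock
        self._attempts = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow(self, key: str) -> bool:
        """Record an attempt for key and return True if it is within the limit."""
        now = self._clock()
        recent = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max:
            return False
        recent.append(now)
        return True
```

Calling `allow(ip_address)` before checking the password keeps an online attacker to a handful of guesses per window, though it does nothing once the password hashes themselves have leaked.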

Personally, I like what Google's doing with the two-step verification. That's probably where security should be going.

[+] Ratufa|15 years ago|reply

Good "dictionaries" for doing on-line brute-force attacks don't just contain words, they contain likely passwords. Guidelines for choosing good passwords should point this out. For example, something like "J4fS!2" is a much much more secure password in terms of protection from on-line attacks than "letmein" or "chang3m3" or "tryandguessthis" or "password123" or "root!@#" or "b4ckm3upsc077y". All of those passwords are actual passwords taken from the list used by an SSH brute-force password cracker.

Because people aren't random when they choose words to remember (e.g. "beavisandbuthead" is also on that list), a better set of password-choosing directions would provide instructions on how to add some additional (pseudo-)randomness to passwords that are being created. The classic "pick a phrase, take the first letters + punctuation" method is one way to do that ("pap,ttfl+p" is a somewhat strong password), and it's not hard to think of other password generation schemes that also create strong passwords.
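The first-letters method is simple enough to state as code. This sketch keeps the first character of each whitespace-separated word plus any trailing punctuation mark; the function name and the exact punctuation rule are assumptions.

```python
import string


def first_letters(phrase: str) -> str:
    """Keep each word's first character, plus a trailing punctuation mark if present."""
    out = []
    for word in phrase.split():
        out.append(word[0])
        if len(word) > 1 and word[-1] in string.punctuation:
            out.append(word[-1])
    return "".join(out)


print(first_letters("pick a phrase, take the first letters + punctuation"))
# prints "pap,ttfl+p"
```

The result is only as unpredictable as the phrase behind it, so a well-known lyric or quotation is a weaker source than a sentence nobody else would write.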

[+] mcorrientes|15 years ago|reply

Using multiple words or even sentences as a password (as described in the article) doesn't always work: too many websites and applications have a password length limit.

I recommend to use a password manager, KeePass is quite good.

A good password manager should be able to easily generate a new strong and complex password every time.

Remembering only one password and getting rid of the laziness of choosing always the same password is another advantage too.

Even if a website stores your password in clear text and someone hacks it, you don't have to worry about the other applications or services where you might otherwise have used the same password.

My personal rule is to choose unique strong passwords (alphanumeric and symbols) with at least 9 chars.

Brute forcing a password with 8 chars was no big deal at all with my 5870, but cracking a password with 9 chars is too expensive (ec2 gpu) or takes too long for the usual hacker.

If someone really brute forces my password, with gpu and cluster support, damn, then he really deserves it.

But that's just my two cents.

[+] EGreg|15 years ago|reply

He forgets one of the easiest ways of getting people's accounts:

http://xkcd.com/792/

[+] presto8|15 years ago|reply

The title is a bit misleading. The passphrase "this is fun" may be 10 times harder to brute-force than "J4fS!2", but both are hard enough that nobody would bother trying to brute force attack them. So they both are equally acceptable. I personally would rather type "J4fS!2", and here's why:

We use PGP Whole Disk Encryption at my company. The passphrase strength requirements are quite strict. It took me about two dozen attempts before I found a password that it would accept. The password was something along the lines of what the article is proposing, five short English words arranged in a sentence (about 25 characters long). This was acceptable to PGP because the software prefers longer passphrases with less entropy per character over short passphrases.

The problem is that it's quite hard to type this long passphrase in when all you can see on the screen is stars or dots. The longer a passphrase is, the higher chance there is of introducing a typo. A shorter passphrase, mixed case and with symbols, is, at least for a programmer, easier to type, especially with muscle memory.

In the case of PGP Whole Disk Encryption, they obviously realized this since you can press the tab key to enable showing the password in plaintext as you type it. I always do this because it increases the success rate of my password acceptance quite a bit.

On an unrelated note, it seems that a far bigger security risk on the Internet is the use of the same password on multiple web sites. If you use "this is cool" on ten different sites, then you are opening yourself up to serious vulnerability if one of the sites is compromised. Using a hash of a common password with the domain name provides a lot more security, but the simple implementations available today produce passwords that are short with mixed case and symbols versus long strings of words. But since the only sane way to use this approach is with a password manager, extension, or bookmarklet anyway, this doesn't seem to be a major limitation.

But having to create and remember short three to five word passphrases for dozens of web sites would be a daunting challenge!
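The "hash of a common password with the domain name" approach mentioned above could look roughly like this. It is a simplified sketch, not any particular tool's algorithm: it uses HMAC-SHA256 keyed by the master password and truncates the Base64 output, and the function name and default length are assumptions. A careful implementation would use a deliberately slow key derivation function instead of a single HMAC.

```python
import base64
import hashlib
import hmac


def site_password(master: str, domain: str, length: int = 16) -> str:
    """Derive a per-site password from one master password and the site's domain."""
    mac = hmac.new(master.encode("utf-8"),
                   domain.lower().encode("utf-8"),
                   hashlib.sha256).digest()
    # Base64 yields mixed-case letters, digits, '+' and '/': short and dense, not wordy.
    return base64.b64encode(mac).decode("ascii")[:length]
```

Each site receives a different string, so a plaintext leak at one site does not directly expose the others. Anyone who learns the master password can regenerate all of them, though.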

[+] swaits|15 years ago|reply

This person is just.. confused, to put it nicely.

My password system is detailed here: http://news.ycombinator.com/item?id=2431480

It's secure, passwords are never stored, and it's not based on any false premises.

170 comments