How To Safely Store A Password

[+] eneveu|14 years ago|reply

Many developers need to read and understand this. It is far from mainstream knowledge...

The other day, I saw the following post about password hashing in my RSS feed: http://isc.sans.org/diary.html?storyid=11110

- No mention of bcrypt (though the post mentions key stretching using SHA1)

- "When selecting an algorithm to hash passwords, it is important to select carefully as it is difficult to change the algorithm later. You will have to ask users to change their password if you do as you no longer know what password they picked." --> seriously? you can update the password the next time they log in...

- "You could also add a secret, in addition to the salt. If the secret is not stored in the database, it would not be easily reachable via a SQL injection exploit (yes, you can use them to read files, but it requires sufficient privileges)." --> security through obscurity, nice

- "For the paranoid, you may want to do the hashing on the client side (javascript) . This way, the server never receives the plain text password. We do this here for the ISC website on our login form [2]." --> oh noes...

Note that the blog has ~15k subscribers according to Google Reader...

---------

I also launched a debate on StackOverflow the other day. A self-proclaimed "security expert" (he later edited his post to remove this part) was advising against using bcrypt, arguing that it would facilitate DOS attacks against the login page... He prefers security through obscurity, using a secret salt:

http://security.stackexchange.com/questions/4781/do-any-secu...

I thought it was a bad idea to leave this answer unchallenged, so I tried arguing with him. I was met with arguments of authority such as "Wow you are out of your element and could not be more misguided on this topic" or "you disagree because you don't understand. Show me an exploit you have written, then I'll pay attention to you.". Happily, some (more experienced) people got in on the debate. I hope this will help developers make an informed choice, if they stumble upon his answer...

(edit: list formatting)

[+] chalst|14 years ago|reply

What about Debian's reason for not using bcrypt, claiming the time it takes to hash is not a weak point in security of /etc/shadow?

http://groups.google.com/group/linux.debian.user/browse_thre...

[+] Fargren|14 years ago|reply

I know very little about security, but is there something wrong with using a hash on the client side and then using bcrypt on the server so that you never receive the plain text password?

[+] zobzu|14 years ago|reply

not storing the password in sql has nothing to do with obscurity. In fact, it's smart. You do not understand what "security through obscurity" means.

Following your train of thought, we should display the salt and the hash publicly. That is extremely dumb.

Put your passwords in your post, otherwise you'll be doing security through obscurity! oh snap?

[+] andrewf|14 years ago|reply

Many developers need to read and understand this.

I note that the question was migrated from StackOverflow to security.stackexchange.com. After all, this is an issue that the security nerds care about. Why bother ordinary developers with it? </sarcasm>

[+] tzs|14 years ago|reply

I'd like to see sites offer the option of not using password-based authentication. Instead I'd like to see public key based authentication as an option.

Basically, the site would have a copy of my public key (say my GPG key or an ssh key), and to authenticate I prove that I have access to the corresponding private key.

[+] JoachimSchipper|14 years ago|reply

You've just described SSL with client certificates. Works perfectly well, is extremely secure, and has an extremely bad GUI in pretty much every browser ever. (It's somewhat difficult to use "on the road", but that's arguably a security feature.)

[+] InclinedPlane|14 years ago|reply

With smartphones this is increasingly feasible. Lots of sites already use smartphone based two factor authentication (similar to rsa keys). There's no reason why a challenge / response system couldn't be set up using smartphones. For example, a website gives you a string of numbers, you input those in your smartphone app and get the response which you then input back to the site. The site can't determine the response ahead of time but it can validate it.

[+] caf|14 years ago|reply

That's already possible with SSL sites - sign up for an account at www.startssl.com if you want to see a real-life example.

[+] barrydahlberg|14 years ago|reply

This is how we use github among other things. AFAIK there isn't standard support for this sort of thing in the browsers though.

[+] icebraining|14 years ago|reply

I'd prefer they'd use OpenID. It would be easier for them to implement, and it'd let you use PKI with providers like https://certifi.ca/

[+] mahyarm|14 years ago|reply

What do you do when you lose the keys? And the default behavior of enabling keys whenever your computer is open? One click bank account access?

[+] Khao|14 years ago|reply

I have read this article and read the Wikipedia entry on bcrypt and I still cannot understand something. In this article it states that : "As computers get faster you can increase the work factor and the hash will get slower.". How can you make the algorithm slower over time and still be able to validate user passwords that were stored before you changed the speed? Could anyone enlighten me on this?

[+] mshafrir|14 years ago|reply

I believe that the work factor is encoded with the hash, so bcrypt can identify which work factor to use.

[+] chops|14 years ago|reply

The easy way is to support the previous work factors and increase it next time the user logs in (since you'll have the password in plaintext in memory). Either way, bcrypt with a paltry work factor of 7 or 8 is orders of magnitude slower than md5 and sha. Jack that up to 12 or 13 and you're pretty much good to go for years, the only problem is the near 1-second processing time on semi-current hardware.
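
A minimal sketch of that upgrade-on-login idea. To stay runnable with only the Python standard library it uses PBKDF2-HMAC-SHA256 as a stand-in for bcrypt, with the iteration count playing the role of the work factor. The record format and the names `hash_password`, `verify_and_upgrade`, and `TARGET_ITERATIONS` are assumptions made for illustration, not anyone's actual implementation.

```python
import hashlib
import hmac
import secrets

# Assumed current policy; raise it as hardware gets faster.
TARGET_ITERATIONS = 200_000


def hash_password(password: str, iterations: int = TARGET_ITERATIONS) -> str:
    """Return 'iterations$salt_hex$digest_hex' so the cost travels with the hash."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_and_upgrade(password: str, stored: str):
    """Return (ok, record). The record is re-hashed if its cost is below target."""
    iterations_s, salt_hex, digest_hex = stored.split("$")
    iterations = int(iterations_s)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), iterations
    )
    if not hmac.compare_digest(candidate.hex(), digest_hex):
        return False, stored
    if iterations < TARGET_ITERATIONS:
        # The plaintext is in memory right now, so this is the moment to re-hash.
        return True, hash_password(password)
    return True, stored
```

The caller would write the returned record back to the database whenever it differs from the stored one. Old and new costs coexist because each record carries its own cost.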

[+] unknown|14 years ago|reply

[deleted]

[+] peteretep|14 years ago|reply

Here is an example:

https://gist.github.com/1051238

[+] stephth|14 years ago|reply

Looks like Rails 3.1 is using bcrypt:

https://github.com/rails/rails/blob/master/activemodel/lib/a...

[+] bdclimber14|14 years ago|reply

So does Devise, one of the most popular authentication Ruby gems.

[+] nonane|14 years ago|reply

The article claims it takes bcrypt 0.3 seconds to hash a 4 char password on a laptop.

How does a server authenticate users in high volume with bcrypt? ~0.25 secs per auth request might warrant having a separate server just for authentication.

[+] tptacek|14 years ago|reply

Several of the largest sites in the world have scaled bcrypt. There are longer answers as to why this is not a big deal, but if it helps to kill the red herring that bcrypt might not scale: Twitter uses it.

[+] chops|14 years ago|reply

It depends on the work factor. On my dev server (a pretty old machine), with a work factor of 7, it's about 300 times slower than md5 (about 10ms per bcrypt hash). That's still plenty fast, and much more secure. Bump it to a work factor of 9, and I'm looking at about 1000 times slower (or getting close to 40 ms per hash).

Part of its beauty is you can adjust the work factor to match your hardware speed requirements.

If you really want ultra-secure, a work factor of 13 is getting pretty slow (about half a second per hash on my crappy machine), and yeah, that might justify an authentication server.

(note: my tests were not exhaustive)
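
Those figures fit bcrypt's cost being exponential in the work factor: each increment doubles the number of key-setup iterations. A rough projection, where `estimate_ms` is a hypothetical helper and the 10 ms baseline is simply the number quoted above:

```python
def estimate_ms(base_ms: float, base_cost: int, target_cost: int) -> float:
    """Scale a measured time per hash, assuming it doubles per work-factor step."""
    return base_ms * 2 ** (target_cost - base_cost)


# Project from the measured 10 ms at work factor 7.
for cost in (7, 9, 13):
    print(cost, estimate_ms(10, 7, cost), "ms")
```

This projects 40 ms at work factor 9 and 640 ms at 13, in the same ballpark as the measurements reported above. Real timings depend on the machine, so benchmark before picking a value.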

[+] tedunangst|14 years ago|reply

Most of your traffic should be coming from people with logged in cookies. If you make people enter their password on every page, you won't have to worry about high volume. :)

Note that bcrypt is basically insensitive to length of password. 4 chars, 8 chars, 32 chars, all take the same amount of time. And it's tunable. You could go down to only 0.01s per hash and still be a million times slower than plain MD5.

[+] jat850|14 years ago|reply

I'm in the process of writing some code that is hitting these very scenarios and characteristics right now. The metrics we put on the API call showed a remarkable spike when we switched from plaintext (in development mode) to bcrypt hashes.

However, after logging in and establishing a session, there's really no impact to the user. We were content to trade the speed of lesser hashing algorithms for the security offered using bcrypt. The only noticeable difference is that our login process went from near-instantaneous to a marginal, but noticeable delay, on login.

(The same delay can be noted when setting new passwords.)

[+] roel_v|14 years ago|reply

I guess by the time there are several users per second just authenticating, the $200 for a GPU that can boost hash performance 100x won't be the problem any more.

[+] WA|14 years ago|reply

An idea would be to use a JavaScript implementation of bcrypt to let clients do the math themselves and just send the hashed password via HTTPS. This has several advantages. One being that the server-side load is reduced heavily. Another one is that a password never shows up on the server in plaintext ever.

Disadvantage: Clients need to have JavaScript enabled.

[+] firsttimeposter|14 years ago|reply

People keep posting this here. But I think http://www.tarsnap.com/scrypt.html should probably be considered the best way to do this today. Google is even using it in ChromeOS.
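
For anyone who wants to experiment: Python's standard library exposes scrypt as `hashlib.scrypt` when it is built against an OpenSSL that supports it. A minimal sketch follows. The cost parameters `n`, `r`, `p` are illustrative values rather than a recommendation, and `scrypt_hash` is a hypothetical helper name.

```python
import hashlib
import hmac
import secrets


def scrypt_hash(password: str, salt: bytes) -> bytes:
    # n, r, p are illustrative cost parameters, not a recommendation.
    return hashlib.scrypt(
        password.encode(), salt=salt, n=2**14, r=8, p=1,
        maxmem=64 * 1024 * 1024, dklen=32,
    )


# Store the salt next to the derived key; both are needed to verify later.
salt = secrets.token_bytes(16)
stored = scrypt_hash("correct horse", salt)
print(hmac.compare_digest(stored, scrypt_hash("correct horse", salt)))
```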

[+] yogsototh|14 years ago|reply

I am surprised nobody has talked about scrypt yet. Here is an old hackernews entry about it.

http://news.ycombinator.com/item?id=601408

I would have loved to use scrypt, but there is only a C implementation. I would have loved to have at least a javascript one.

[+] 16s|14 years ago|reply

It's good to raise awareness of this issue. When more devs begin using bcrypt or scrypt, offline password cracking will be much, much more difficult.

The only reason GPUs are cited as testing 600 million hashes a second is that the underlying hashes came from a Microsoft Windows Active Directory where they were simply MD4 encoded. That speed is not possible with bcrypt. Devs need to understand this.

Edit: Yes, that's MD4 not MD5. Microsoft Windows NT hashes are simply Unicode strings that are MD4'ed. This includes Windows 7 and Windows 2008 server.

[+] yuhong|14 years ago|reply

And note if you already have this hash you can use it to login directly anyway as most Windows network protocols take this hash directly. The real important thing IMO is NTLM challenge/responses based on the hash, which unfortunately is not much better. In case of NTLMv1/MS-CHAP it is three 56-bit DES operations on separate parts of the 128-bit hash (the third being only 2^16 so it is easy to precompute, as shown by asleap). NTLMv2's HMAC-MD5 is fast too.

[+] falcolas|14 years ago|reply

Honest question. Why would someone who handles more than a couple-dozen login attempts per second choose to use bcrypt? It would seem that the computational overhead of supporting bcrypt at scale would not make a lot of financial sense.

[+] pbreit|14 years ago|reply

Can someone help me understand what a password cracker does with a list of salted/hashed passwords? How do they know they've figured out the right plain text passwords without bouncing them against the authentication logic?

[+] rorymccune|14 years ago|reply

Typically attackers do this during an offline brute-force attack (where they have a copy of the hashed password, probably along with other user information from the database).

The attack is done by getting a list of plain text passwords and then running them through the same hashing algorithm as was used originally (adding the salt value if present). Once the attacker has done that they just compare the hashed strings. If they match then it'll be the same password.

Commonly the attacker would start with a dictionary of common passwords and submit them along with common variants (e.g. password, Password, password1, passw0rd).

If that's not successful they can move on to pure brute force (e.g. a, ab, ac, etc.)
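
The offline check described above can be sketched in a few lines. The scheme here (a single salted SHA-256) and the names `salted_hash` and `dictionary_attack` are assumptions for illustration. The point is only that the attacker needs nothing but the stolen salt and hash to test guesses.

```python
import hashlib


def salted_hash(salt: str, password: str) -> str:
    # Assumed scheme for illustration: a single salted SHA-256.
    return hashlib.sha256((salt + password).encode()).hexdigest()


def dictionary_attack(salt: str, target: str, candidates):
    """Return the first candidate whose hash equals the stolen one, else None."""
    for guess in candidates:
        if salted_hash(salt, guess) == target:
            return guess
    return None


# A "stolen" record: the salt is stored next to the hash.
salt = "a1b2c3"
stolen = salted_hash(salt, "passw0rd")
wordlist = ["password", "Password", "password1", "passw0rd"]
print(dictionary_attack(salt, stolen, wordlist))
```

With a fast hash this loop runs through huge wordlists quickly. A deliberately slow hash makes every single guess expensive.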

[+] andrewcooke|14 years ago|reply

i suspect you don't know that the salt is stored with the password [edit: when using standard libraries like crypt and bcrypt - please don't invent your own scheme]. so when someone steals the password list they get the salt too.

the salt is not "secret" - it is stored in plain text for each password. it does not need to be secret to do its job (defend against rainbow tables).

[+] giaskaylee|14 years ago|reply

The real question should always be how you detect and handle these attacks. Allowing someone to attack your service for 12 years and eating up your resources in the meanwhile just sounds too passive a solution.

[+] Xk|14 years ago|reply

That's not the attack you worry about: instead, consider the case where someone somehow obtains the database and can do an offline attack on it. Be it a SQL injection or account compromise (or sheer negligence and publishing the database), once that happens you'd better handle passwords reasonably well.

If the only attack situation you're worried about is an online guessing attack, then there's no need to even hash passwords.

[+] tomjen3|14 years ago|reply

What always annoys me with the discussion of passwords is that everybody here focuses on a technical solution that allows the user to continue to use insecure passwords.

That isn't the problem. Reuse is. And the best way around that is to not let the user select the password, just generate it server side. Technically this is easier to get right than some complex password generation scheme and the end result is probably better too.

[+] mleonhard|14 years ago|reply

Let's say I use bcrypt and want to increase the work factor every year. Is it possible to determine the work factor of an existing hash, or will I need to maintain this information alongside the hash? The java and python bcrypt APIs don't have any function that returns the work factor.

Can I increase the work factor of an existing hash, or must I wait until the user logs in and then use the plaintext password to generate a new hash?

Is there a bcrypt API that provides a hash comparison function that addresses timing attacks? The py-bcrypt example code uses the '==' operator to compare hash strings, leaking timing information:

    # Check that an unencrypted password matches one that has
    # previously been hashed
    if bcrypt.hashpw(password, hashed) == hashed:
            print "It matches"
    else:
            print "It does not match"

(from http://www.mindrot.org/projects/py-bcrypt/)
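
One way to avoid the short-circuiting `==`, sketched with Python's standard library: `hmac.compare_digest` is designed not to stop at the first differing byte. `hashes_match` is a hypothetical helper. In the example above, `computed` would be the result of `bcrypt.hashpw(password, hashed)` and `stored` would be `hashed`.

```python
import hmac


def hashes_match(computed: str, stored: str) -> bool:
    # compare_digest avoids content-based short-circuiting, unlike '=='.
    return hmac.compare_digest(computed.encode(), stored.encode())
```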

[+] mleonhard|14 years ago|reply

To answer my first question, the work factor can be extracted from the hash. It is stored as a 2-digit decimal number in the fifth and sixth bytes of the hash. See http://code.google.com/p/py-bcrypt/source/browse/bcrypt/bcry...
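
A small sketch of reading that field, assuming the usual `$2a$NN$...` layout. `bcrypt_work_factor` is a hypothetical helper, and the example string only has the right shape; it is not a real hash.

```python
def bcrypt_work_factor(hashed: str) -> int:
    """Read the cost from a '$2a$NN$...' style bcrypt string."""
    parts = hashed.split("$")
    # parts == ['', '2a', 'NN', '<salt and digest>']
    if len(parts) != 4 or not parts[1].startswith("2"):
        raise ValueError("not a bcrypt hash string")
    return int(parts[2])


# Placeholder with the right shape; the 53 trailing characters are not a real hash.
example = "$2a$12$" + "x" * 53
print(bcrypt_work_factor(example))
```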

[+] peteretep|14 years ago|reply

I recently put together a commentary + code on how to upgrade passwords hashed insecurely in your DB without user intervention. This was in response to MtGox talking about "slowly migrating users", which presumably means upgrading passwords at the point of login:

https://gist.github.com/1051238

[+] carbocation|14 years ago|reply

If I'm not misunderstanding your purpose, why not just:

1) Take those md5 unsalted passwords and 2) bcrypt them

Then upgrade the users to non-pre-hashed bcrypt as they log in?

?
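
A sketch of that two-stage idea, with PBKDF2-HMAC-SHA256 from the standard library standing in for bcrypt so the example runs without extra packages. The record layout, the scheme labels, and the names `wrap_legacy`, `login`, and `slow_hash` are all assumptions for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative cost for the stand-in KDF


def slow_hash(data: bytes, salt: bytes) -> str:
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS).hex()


def wrap_legacy(md5_hex: str) -> dict:
    """Protect an existing unsalted MD5 hash without knowing the password."""
    salt = secrets.token_bytes(16)
    return {"scheme": "kdf(md5)", "salt": salt,
            "hash": slow_hash(md5_hex.encode(), salt)}


def login(password: str, record: dict):
    """Verify, and move a wrapped record to the plain scheme on success."""
    if record["scheme"] == "kdf(md5)":
        # Rebuild the legacy MD5 first, then apply the slow hash on top.
        material = hashlib.md5(password.encode()).hexdigest().encode()
    else:
        material = password.encode()
    if not hmac.compare_digest(slow_hash(material, record["salt"]), record["hash"]):
        return False, record
    if record["scheme"] == "kdf(md5)":
        # Plaintext is available now, so drop the MD5 layer.
        salt = secrets.token_bytes(16)
        record = {"scheme": "kdf", "salt": salt,
                  "hash": slow_hash(password.encode(), salt)}
    return True, record
```

The first step can be run over the whole table at once, so no bare MD5 hashes remain even for users who never log in again.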

[+] bborud|14 years ago|reply

In security the most important thing you can teach people is to be aware when things are harder than they look so they'll take a bit of extra time educating themselves and talking to other people.

Learning how to separate good from bad advice is a skill that needs to be maintained. Also, in all software that is supposed to provide some form of security, one has to be prepared for the eventuality that it probably contains errors.

I used to read a lot of books on cryptography and the use of cryptography. I've forgotten most of it by today, and to be quite frank: the more I know, the less I want to write crypto software. There is something deeply unsatisfying about work where you know that what will in all likelihood trip you up is some trivial, stupid mistake.

It isn't hard because of the crypto itself. Sure, certain cryptographic libraries can be extremely awkward to use (which in itself is a security risk), but the problems usually come from where you aren't looking for them.

I cringe a bit when people advertise software as being secure because it uses this or that encryption scheme. I also cringe when people claim that they "encrypt databases" and their systems are therefore secure -- because, for most usage scenarios, I can't think of any really secure way of doing this. Not without compromises anyway. And while I honestly know that I have just a rudimentary grasp on cryptography, I know a lot more about it than most people who make products that hinge upon correct application of crypto.

Just the other day I was trying to determine how many rounds I wanted to use in bcrypt for storing passwords for a given system. I think I spent most of the day pondering this question, writing benchmarks and reading up on what other people said on the topic. A couple of days later a friend of mine emailed me a code snippet that implemented the password hashing scheme of a commercial product they use that makes shameless claims about being secure. (I think he probably just looked at the hashed passwords and made a guess about the method they had used. I don't think he had a look at the original source code). If memory serves the product uses unsalted SHA1 hashing. In other words, the vendor didn't even bother thinking about the problem.

What scares me a bit is that even I thought "well, if they claim to have well thought-out password handling and they are in the business of selling security systems, I suppose they have probably given this a lot of thought" the first time I visited their website. After all, large companies give them millions. Right?

I wonder what other things they are doing equally badly.

[+] klon|14 years ago|reply

Does anyone know how Django's built-in password hashing fares?

[+] unknown|14 years ago|reply

[deleted]

[+] mildweed|14 years ago|reply

Previous post: http://news.ycombinator.com/item?id=1998819

[+] _urga|14 years ago|reply

To see if I understand bcrypt correctly, would this pseudo code achieve a similar adaptable computational cost?

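  // Pseudocode: SHA1() stands for any function returning the SHA-1 digest.
  function stretch(uniqueSalt128Bit, password, rounds) {
    var hash = SHA1(uniqueSalt128Bit + password);
    var iterations = Math.pow(2, rounds); // work doubles with each extra round
    while (iterations--) hash = SHA1(hash);
    return hash;
  }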
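
A runnable Python rendering of the pseudocode above, for experimenting with how the cost grows with `rounds`. The function name `stretch` is hypothetical. It shows the iteration structure only and makes no claim to be equivalent to bcrypt.

```python
import hashlib


def stretch(salt: bytes, password: bytes, rounds: int) -> bytes:
    """Iterated SHA-1: 2**rounds extra hash applications, as in the pseudocode."""
    digest = hashlib.sha1(salt + password).digest()
    for _ in range(2 ** rounds):
        digest = hashlib.sha1(digest).digest()
    return digest


# Each extra round doubles the number of SHA-1 calls.
print(stretch(b"0123456789abcdef", b"secret", 10).hex())
```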

[+] antihero|14 years ago|reply

Or a stretched newer algorithm like Whirlpool. Or stretched SHA-256? What advantages does bcrypt have over them?

209 comments