I know I won't be able to easily convince anyone of this, but I thought I'd mention...
Colin Percival's "scrypt" password hash is:
1) production-ready (and has been for a long time)
2) superior to bcrypt, as it is designed to be expensive in both CPU and memory (hence, scrypt is "memory-hard", whereas bcrypt is not)
I don't have time to go into further detail. I encourage you to check it out. It's quite simply "the future of password hashing". (Bcrypt will be defeated by natural advances in multi-core hardware; scrypt won't ever be.)
Passwords hashed with scrypt with sufficiently-high strength values (there are 3 tweakable input numbers) are fundamentally impervious to being cracked. I use the word "fundamental" in the literal sense, here; even if you had the resources of a large country, you would not be able to design any hardware (whether it be GPU hardware, custom-designed hardware, or otherwise) which could crack these hashes. Ever. (For sufficiently-small definitions of "ever". At the very least "within your lifetime"; probably far longer.)
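For readers who want to see what the three tweakable numbers look like in practice, here is a minimal sketch using `hashlib.scrypt` from Python's standard library. The specific values (N=2**14, r=8, p=1) are illustrative assumptions, not a recommendation from this thread, and the memory figure is the usual 128 * N * r approximation.

```python
import hashlib
import os


def scrypt_hash(password: str, salt: bytes, n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    """Derive a 32-byte key with scrypt; n, r and p are the three cost knobs."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32)


def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate working memory scrypt needs for one hash: 128 * N * r bytes."""
    return 128 * n * r


salt = os.urandom(16)  # fresh random salt per password
key = scrypt_hash("correct horse battery staple", salt)
print(len(key), "byte key,", scrypt_memory_bytes(2**14, 8) // 2**20, "MiB per guess")
```

Raising N raises both the CPU and the memory cost of every guess, which is the "memory-hard" property the comment is pointing at.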
The usual concern with scrypt is that it has less research behind it. The bcrypt hash function has gone through over twelve years of attempts without being broken in the cryptographic sense. The scrypt hash has only around three years of the same.
I would expect that within 5-10 years, scrypt will be the normal suggestion. It really is much better on the fundamentals, so much so that some experts recommend it even in spite of its relative newness. For the moment, though, there are arguments to be made either way.
One thing to be careful about is that you don't open yourself up to a trivial DoS. The default scrypt command-line program will churn CPU for a full second when encrypting a file. If you use similar settings, it will not take an attacker many connections to reduce even a large cluster to tears.
There are no known good implementations of scrypt in PHP (http://stackoverflow.com/questions/10149554/are-there-any-ph...), so I don't know how "production ready" that is, considering PHP is the most popular platform for the web, as much as I hate the language.
Fyi, scrypt is in the Debian testing and unstable repos [1], and Ubuntu's universe repo [2]. Not sure about the RPM ecosystem, but I'm sure it's available there too. And there are a bunch of bindings and libs in various languages on github as well [3]. And there is always the source [4].
I personally agree that "Use bcrypt." should become "Use scrypt." soon. My main gripe is that there is far less library support for it, at least for now.
The article says to give each password its own unique salt and then store the salt (as well as the salted hash) in the database.
This seems like a bad idea to me. If a hacker gets access to the salted passwords, in this case he'll probably figure out how to get access to the salts too.
I figure if the salt is stored in the code (or a config file...) rather than the database itself, at least it's two different hacks to get (1) the salted hashes, and (2) the salt.
No it's not a bad idea and essentially it's what BCrypt does.
Think about it. If you use the same salt for all passwords then I can easily create a rainbow table consisting of "keyword" + salt hashes and doing so I can crack multiple passwords.
However if there is a different salt for each password then this type of attack becomes more difficult as I have to essentially do the same amount of work for only a single password.
Finally, we need to store the salt in the database because it's needed in order to recreate the hash using the user's password for authentication.
You are. The salt is just to stop the use of rainbow tables, which are pre-generated maps between plaintext passwords and their hashes.
Anyway, you want to store a new salt (not just a system-wide 'this is my salt' salt) for each stored password anyway, so you will need to store that data somewhere. You could obfuscate a little by storing the salts elsewhere, but it seems a little extreme.
Yup. The main purpose of salt (aside from increasing entropy by increasing the length and complexity of the hashed value) is that it prevents rainbow table attacks where an attacker pre-computes (or downloads precomputed) hashes for common passwords, dictionary words, and brute-force style variations of same.
A salt unique to the site would require the attacker to create a site-specific rainbow table, but once created it can be used for all passwords. Having a unique salt per password means that the attacker would have to generate a separate table per user, which for a suitable salt & algo is impractical, even if (as they normally are) the salts are stored with the passwords.
I was always under the impression that the main point of salting a password before hashing (unique or not, stored in cleartext or otherwise) was to prevent the hash from being referenced against a rainbow table of precomputed hashes. Obviously this is in the case of using md5/sha1 for your hashing.
As an example, find the md5 of a dictionary word and google it; its original value is bound to be in the first or second result. Now find the md5 of that word plus a random string (aka a salt)... google it and there shouldn't be any results. And even if there were one due to a collision, that password wouldn't work, because it would be re-salted prior to the comparison happening.
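The lookup described above can be sketched in a few lines of Python. The three-word dictionary is a made-up stand-in for a real precomputed table, and MD5 is used only because the comment talks about it, not because it is suitable for passwords.

```python
import hashlib
import os


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest, used here only to mirror the example in the comment."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for a precomputed table: hash -> original word.
precomputed = {md5_hex(w.encode()): w for w in ["letmein", "password", "dragon"]}

# An unsalted hash is found by a plain dictionary lookup.
leaked_unsalted = md5_hex(b"password")
print(precomputed.get(leaked_unsalted))

# With a random salt mixed in, the same table no longer contains the hash.
salt = os.urandom(16)
leaked_salted = md5_hex(salt + b"password")
print(precomputed.get(leaked_salted))
```

The salted value can still be brute-forced by someone who knows the salt, which is why the rest of the thread argues for a slow hash on top of per-password salts.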
The individual salts are added only to prevent an attack with a list of pre-calculated hashes.
You could also store another common salt ("pepper"?) in the code.
On a more esoteric note: If you are looking to resist attacks by quantum algorithms, there are post-quantum algorithms for that[1] (they are computed on normal machines, but the problems behind the cryptosystems are hard to solve even for quantum computers).
If you're storing passwords as MD5/SHA hashes, how difficult would it be to switch over to bcrypt? I've never had to do this, but I would imagine it would be somewhat trivial. With all of the password leaks that have happened over the past few years, I'd imagine a good number of developers are aware that storing passwords as MD5/SHA hashes is somewhat risky, so I can't understand why big websites (LinkedIn) are still doing it.
I liked the article and it was a nice little afternoon read, but the whole thing could have been condensed to "use bcrypt for passwords".
I'm not really sure where to stand on this. On one hand, we have PLENTY of security articles stating the same thing (bcrypt, bcrypt, and just in case you've forgotten... bcrypt), which leads to an observed oversaturation of the same subject matter. On the other hand, we have a huge company like LinkedIn that doesn't have the presence of mind to use something other than vanilla SHA-1. Maybe there's just enough laziness/stupidity in the world to require a constant barrage of the same security articles every week.
Disagree. The point of this article as I understood it was not to provide short advice, but to provide a comprehensive explanation of the problem and why the 3 listed solutions are good ones.
People often ask "why use bcrypt", and the response is that they should google it. If you look through the first page of google results for [why use bcrypt], though, none have a good discussion of the reasonable alternatives, or when you might want to use one or the other.
Coda Hale's post has a pretty good explanation of why bcrypt is good, but I personally find this to be more in-depth.
I imagine things like this are a result of the account management part of LinkedIn being built years ago, when we thought plain-jane SHA-1 was OK to use, and then they just never got around to using bcrypt.
Ironically, the algorithm to upgrade to bcrypt is simple. Add a flag to the account table if they've upgraded or not. Next time the user signs in successfully, re-hash their password with bcrypt, toggle the flag, and update the password_hash value in the database.
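A minimal sketch of that upgrade-on-login flow, under some loud assumptions: the user record is a plain dict, the field names (`upgraded`, `salt`, `password_hash`) are hypothetical, the legacy scheme is unsalted SHA-1, and `hashlib.scrypt` stands in for bcrypt so the example needs nothing outside the standard library.

```python
import hashlib
import hmac
import os


def legacy_sha1(password: str) -> str:
    """The old scheme being migrated away from (assumed: unsalted SHA-1 hex)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def strong_hash(password: str, salt: bytes) -> str:
    """Stand-in for bcrypt: scrypt from the standard library, hex-encoded."""
    return hashlib.scrypt(password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32).hex()


def login(user: dict, password: str) -> bool:
    """Check the password; on a successful legacy login, re-hash and set the flag."""
    if user["upgraded"]:
        expected = strong_hash(password, bytes.fromhex(user["salt"]))
        return hmac.compare_digest(expected, user["password_hash"])
    if not hmac.compare_digest(legacy_sha1(password), user["password_hash"]):
        return False
    # The plaintext is only available right now, so this is the moment to upgrade.
    salt = os.urandom(16)
    user.update(salt=salt.hex(), password_hash=strong_hash(password, salt), upgraded=True)
    return True
```

Accounts that never sign in again keep their old hash under this scheme, so it is usually paired with a forced reset or with wrapping the stored hashes.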
I agree there are a ton of articles saying "Use bcrypt." After Coda's post (http://codahale.com/how-to-safely-store-a-password/) it's almost become a meme. I don't, however, think that the people who say "Use bcrypt!" tend to explain why they say that.
I think the reason that this happens so often is that regular developers just don't care. But that's because they don't know why they should care. Given a proper explanation (and an attention span longer than "Squirrel!"), any reasonable developer would (at least, should) care.
There are a multitude of sophisticated third-party solutions to authentication. Facebook, Twitter, and Google all offer competent solutions. Don't like those? Use BrowserID.
Integrating any of these is actually quite a bit easier than rolling your own solution. It reduces hack risk, provides a better experience for your customers (what was my password again?), and almost certainly will be more reliable than your website.
OpenID and OAuth really did a lot, but there's just nothing called "don't use passwords." Fingerprint readers suck. Anything biometric that doesn't suck costs too much, and 99% of people don't have them. A good KDF is not bad in comparison to a centralized authentication server considering other factors.
Someone, somewhere will be storing user passwords/digests for the foreseeable future. And they will do it incorrectly.
This naively assumes that your entire userbase uses those services and would like to associate their Google (et al.) account with your service. This may not always be the case.
Someone has to store the passwords, it would be good if there was a way you could be assured your data at rest was safe.
I used to hear some controversy with regards to "stretching." The argument back in the day was, "it's partially security through obscurity, but the danger is that there isn't research to prove that a hash of a hash is cryptographically strong."
So is there research that proves that hashing a hash of a hash of a hash (x100000) doesn't result in a smaller range of values than a single hash for SHA algorithms? Is there no such convergence?
Stretching isn't "security through obscurity". It's "security through increasing the attacker's cost by a huge amount while increasing your own cost by a minimal amount".
But don't use stretched SHA1. Use bcrypt or scrypt or PBKDF2, all of which explicitly address this particular concern.
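To make the difference from a bare hash-of-a-hash chain concrete, here is a sketch of the first output block of PBKDF2-HMAC-SHA256 written out by hand and checked against `hashlib.pbkdf2_hmac`. Every round is an HMAC keyed with the password, and the rounds are XORed together rather than only the last value being kept. The function name is made up for this illustration.

```python
import hashlib
import hmac


def pbkdf2_one_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte block of PBKDF2-HMAC-SHA256, spelled out for illustration."""
    # U_1 = HMAC(password, salt || INT(1)); later rounds feed U_{i-1} back in.
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    block = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        block ^= int.from_bytes(u, "big")  # XOR every round into the result
    return block.to_bytes(32, "big")


# The hand-rolled version agrees with the library implementation.
assert pbkdf2_one_block(b"password", b"salt", 1000) == hashlib.pbkdf2_hmac(
    "sha256", b"password", b"salt", 1000
)
```

In real code the library call is the one to use; the hand-written loop is only there to show the structure.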
An even better way of securely storing your passwords would be to mix them around on entry to your bcrypt hash function in a unique way that makes it impossible to brute force your leaked password hashes without having access to the code that did them.
So something like a HMAC digest generated using a pepper stored in the source code/binary or on disk before passing it to bcrypt/scrypt? :)
This only really protects against SQL injection attacks, though, and only when there is actually a separation between where you store the bcrypt digests and where you store the pepper. (Granted, there are a lot of SQL injection attacks.)
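A sketch of the HMAC-with-pepper idea, assuming the pepper is a secret loaded from code or configuration rather than the database. The constant below is a placeholder, and `hashlib.scrypt` is used as the slow hash so the example stays within the standard library.

```python
import hashlib
import hmac
import os

# Placeholder only: a real pepper would come from config or a secrets store,
# somewhere separate from the database that holds the digests.
PEPPER = b"example-pepper-not-for-production"


def peppered_hash(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    """HMAC the password with the pepper, then run the result through scrypt."""
    prehash = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.scrypt(prehash, salt=salt, n=2**14, r=8, p=1, dklen=32)


salt = os.urandom(16)  # per-password salt, stored next to the digest as usual
stored = peppered_hash("hunter2", salt)
```

Someone who dumps only the table gets salts and digests but cannot test guesses without the pepper. Someone who also reads the code or config is back to attacking plain scrypt.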
If you happen to have a web app that stores passwords in clear text or as SHA-1 hashes, all is not lost. You can apply a further secure hash to the existing values stored in the database and update your authentication validator to match.
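One way to read that suggestion, sketched under assumptions: the legacy column holds unsalted SHA-1 hex digests, each one is wrapped in scrypt with a fresh salt without needing the user's plaintext, and the validator then computes the same two steps at login. The record layout and function names are hypothetical.

```python
import hashlib
import hmac
import os


def sha1_hex(password: str) -> str:
    """The legacy hash already sitting in the database (assumed unsalted SHA-1)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def wrap_legacy(sha1_hexdigest: str) -> dict:
    """Wrap an existing SHA-1 value in scrypt; the plaintext password is not needed."""
    salt = os.urandom(16)
    wrapped = hashlib.scrypt(sha1_hexdigest.encode("ascii"), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return {"salt": salt, "hash": wrapped}


def verify_wrapped(password: str, record: dict) -> bool:
    """Updated validator: SHA-1 first, then scrypt with the stored salt."""
    inner = sha1_hex(password).encode("ascii")
    candidate = hashlib.scrypt(inner, salt=record["salt"], n=2**14, r=8, p=1, dklen=32)
    return hmac.compare_digest(candidate, record["hash"])
```

This covers every stored row in one batch job, including accounts that never log in again.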
Excellent reading - I spent countless hours researching all this stuff for a project a few years back. After my research I came up with a protocol very similar to SRP, but I find that SRP is a nice POC for both protocols.
One thing I didn't find a solution for is keyloggers and other similar attacks. If you look at securing your service as a whole, you have to acknowledge the risk of keyloggers too. Now, with Flame, Stuxnet and all the other nice things still in the shadows, keyloggers can suddenly also become a risk on a large scale.
I was just testing www.Leakedin.com to check for a possible breach of my LinkedIn password. I entered my password and it said it was leaked. But when I entered a random value, like dsfsfgfdsgsd, it said hoorah!, not leaked.
So, the onus is on the users, who don't want their passwords to be leaked, rather than depend upon someone to keep it safe.
OK, so I get the message. Use bcrypt. Don't worry, that's what I'll do in production.
On the other hand, if it's so hard to roll your own, can somebody point out the security flaws in the given Python function? Seems pretty straightforward to my untrained eye.
Why does everyone need to be an expert in password generation and storage? Would password-as-a-service be feasible, or even authentication-as-a-service? (And I do not mean FB or Twitter auth.)
it is open source, and supported by all platforms i use (windows,osx,android,ios) and the interface is pretty well designed.
Once local database is open, simply doing ctrl+c on any of the sites copies the password to clipboard for a very limited time.
this is still a major pain, especially since you need to protect the safe with a long password and this is particularly painful to type on mobile devices.
[+] [-] sillysaurus|14 years ago|reply
[+] [-] amalcon|14 years ago|reply
[+] [-] tedunangst|14 years ago|reply
[+] [-] yichi|14 years ago|reply
[+] [-] SkyMarshal|14 years ago|reply
1. http://packages.debian.org/search?keywords=scrypt
2. http://packages.ubuntu.com/search?keywords=scrypt
3. https://github.com/search?q=scrypt&type=Repositories
4. http://www.tarsnap.com/scrypt.html
[+] [-] pmylund|14 years ago|reply
[+] [-] jontro|14 years ago|reply
Looking at jBCrypt, for instance, they're only at version 0.3 with no updates since 2010, which makes me really nervous about using it.
[+] [-] AgentConundrum|14 years ago|reply
[+] [-] nshankar|14 years ago|reply
[+] [-] pud|14 years ago|reply
Am I misunderstanding?
[+] [-] masklinn|14 years ago|reply
Aye. There are two points to the salt:
1. Avoid precomputed hash attacks ("rainbow tables"), where the attacker has a big list of hash:password pairs and can just walk the table of (leaked) password hashes to get the cleartext. A global salt is sufficient for that, and where it's stored does not matter (it can be a config file or a config table or whatever).
2. Avoid the attacker being able to brute-force the whole collection at once. Here each password needs its own salt: the attacker needs the (salt, hash) pair to brute-force a given password, so they can't just compute a million (salted) hashes and cross-check them against the whole table; they have to redo the work for each and every password they want to crack. This requires a unique salt per password/hash, and the salt can just be stored with the hash.
Salts are not secrets; they just exist to make the hashing of a given user's password unique. They are generally returned as part of the hash function's result (alongside the number of rounds, so the result has the shape (cost, salt, hash)). It's understood and expected that the attacker knows them: it does not matter to their purpose.
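The (cost, salt, hash) shape can be sketched with PBKDF2 from Python's standard library. The `$`-separated encoding and the iteration count are assumptions made for this example, not the output format of bcrypt or any other real library.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed cost; tune for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return 'cost$salt$hash' with a fresh random salt for every password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the hash from the stored cost and salt, then compare in constant time."""
    cost, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(cost)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Two users with the same password get different stored strings, so a cracked entry says nothing about the other rows, and the salt sits in plain view without weakening that.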
[+] [-] Dylan16807|14 years ago|reply
You might feel like adding a hidden component but it won't noticeably help security and it's not a salt.
[+] [-] ashconnor|14 years ago|reply
[+] [-] damncabbage|14 years ago|reply
Too often people assume one way or the other (and come up with their own hare-brained password encryption scheme, defending it from all-comers).
[+] [-] SCdF|14 years ago|reply
[+] [-] scoot|14 years ago|reply
[+] [-] navitronic|14 years ago|reply
[+] [-] jp_sc|14 years ago|reply
[+] [-] unknown|14 years ago|reply
[deleted]
[+] [-] ZoFreX|14 years ago|reply
[+] [-] zzzeek|14 years ago|reply
Hi Phil -
Yes. Salts are to defeat things like this: http://en.wikipedia.org/wiki/Rainbow_table
[+] [-] nshankar|14 years ago|reply
[+] [-] DanielRibeiro|14 years ago|reply
[1] http://crypto.stackexchange.com/questions/494/what-is-the-po...
[+] [-] jharding|14 years ago|reply
[+] [-] fein|14 years ago|reply
[+] [-] nsanch|14 years ago|reply
[+] [-] leftnode|14 years ago|reply
[+] [-] pmylund|14 years ago|reply
[+] [-] stickfigure|14 years ago|reply
[+] [-] pmylund|14 years ago|reply
[+] [-] dekz|14 years ago|reply
[+] [-] peeters|14 years ago|reply
[+] [-] tptacek|14 years ago|reply
[+] [-] mparlane|14 years ago|reply
[+] [-] pmylund|14 years ago|reply
[+] [-] teyc|14 years ago|reply
[+] [-] heretohelp|14 years ago|reply
Or invalidate and send an email.
[+] [-] uzero|14 years ago|reply
[+] [-] nshankar|14 years ago|reply
[+] [-] trjordan|14 years ago|reply
[+] [-] unknown|14 years ago|reply
[deleted]
[+] [-] rokhayakebe|14 years ago|reply
[+] [-] SonicSoul|14 years ago|reply
[+] [-] unknown|14 years ago|reply
[deleted]
[+] [-] billymillions|14 years ago|reply
return getDigest(password, salt) == digest
getDigest returns a tuple
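The article's function is not quoted in this thread, so the following is only a guess at the shape of the bug being pointed out: `get_digest` is a hypothetical stand-in that returns a `(salt, digest)` tuple, and comparing that tuple to a bare digest is always false.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def get_digest(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Hypothetical stand-in for the article's helper: returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def is_password_buggy(password: str, salt: bytes, digest: bytes) -> bool:
    # Compares a tuple with a bytes object, so this is False even for the right password.
    return get_digest(password, salt) == digest


def is_password_fixed(password: str, salt: bytes, digest: bytes) -> bool:
    # Unpack the tuple and compare only the digest, in constant time.
    _, candidate = get_digest(password, salt)
    return hmac.compare_digest(candidate, digest)
```

With the buggy version nobody can log in at all, which at least fails closed. It is a small illustration of how an easy-looking roll-your-own check can go wrong.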
[+] [-] pnayak|14 years ago|reply
[+] [-] Morg|14 years ago|reply
[deleted]
[+] [-] uselessuseof|14 years ago|reply
[deleted]