lucb1e | 13 years ago:
First off, MD5 is said to be insecure. Sure, but only for some purposes. This very recently appeared on the security stackexchange website[1]: Why does some popular software still use MD5?
In the SHA-1 chapter he mentions that you can truncate SHA-2 outputs if you are concerned about storage space. No! It may be entirely safe or even desirable in the case of SHA-2, but how can you be sure? It might very well not be designed for that. Never invent your own crypto. Another recent thread on the security stackexchange put it well[2]: "The fact that you need to ask this question is the answer itself - you do not know what is wrong with stacking these primitives, and therefore cannot possibly know what benefits or weaknesses there are."
Then the article goes on about SHA-2: "Sha-256 should be chosen in most cases, including hashing your user passwords." Oh God no. Use bcrypt, scrypt, PBKDF2--I don't care. Don't invent your own crypto. Hashing works great for storing passwords, but general-purpose hashing algorithms weren't invented for password storage! Hashing algos are supposed to be fast (see the SHA-3 competition: they were looking for algorithms that were especially fast in hardware implementations, as a complement to SHA-2).
Lastly, testing only a single implementation (the one in .NET) is hardly a good comparison. It doesn't matter much how many hashes you can do per millisecond, especially since secure password hashing algorithms are supposed to run slowly.
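To make that concrete, here is a minimal sketch of password hashing done with a purpose-built KDF rather than a bare hash, using Python's standard-library PBKDF2 (the iteration count and function names are illustrative only, not a recommendation):

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000):
    """Derive a salted PBKDF2-HMAC-SHA256 digest; store all three values."""
    salt = os.urandom(16)  # a random per-password salt defeats precomputed tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest, iterations

def verify_password(password: str, salt: bytes, digest: bytes, iterations: int) -> bool:
    """Recompute with the stored salt and iteration count, compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

The point is that `iterations` is a tunable cost knob: raising it makes every brute-force guess proportionally more expensive, which a bare SHA-256 call cannot offer.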
Kop | 13 years ago:
Thanks lucb1e for pointing it out -- I edited the article to reflect this, and I apologize for (temporarily) spreading the misconception that SHA-like functions are good for hashing passwords.
Kop | 13 years ago:
The intention of my article was to analyze the different hash functions and benchmark them in .NET, not to discuss anything related to passwords (which would require a blog post of its own, and it's a topic I hadn't extensively researched). I got a little carried away mentioning passwords.
I thought a hash being relatively slower makes it better.
aeon10 | 13 years ago:
Because then it would take more time to brute force. I mean, if hashes were near instant, then even without any theoretical collision attack that wouldn't be good; that hash would fall to brute force faster than a slower one!
masklinn | 13 years ago:
> I thought a hash being relatively slower makes it better.
Depends on the workload, and the article does not explain anything there so it's completely useless.
For "slow hash" workloads such as password hashing you want a slow hashing function, but none of the three hashes listed is acceptable for password hashing anyway, so their comparison is not really relevant. (They can be part of a password hashing strategy, e.g. as the crypto hash function of PBKDF2 through HMAC[0], but in any case you'll want to size your iteration count based on the acceptable time spent hashing in the normal workflow.)
For things like checksumming, you do care about checking throughput, and you care that payloads are hard to craft (so it's not feasible to alter the file but keep the checksum unchanged). Depending on how important the checksumming is, you may accept tradeoffs: if you only want to be resistant against accidental corruption, CRC32 might be enough (very fast, but easy to fake); if you want to protect users against malicious attacks (e.g. a distro package repository), you'll want a good cryptographic hash, as validation is more important than raw throughput. MD5 can work there, maybe more on the CRC32 side, or as an early validator (e.g. with a fallback hash, so MD5 is used to weed out some stuff fast and potential false positives are eradicated through a second hash function).
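That tiered scheme can be sketched as follows (a toy illustration using Python's standard library; the two-stage design and names are my own, not from the article):

```python
import hashlib
import zlib

def validate(data: bytes, expected_crc32: int, expected_sha256: str) -> bool:
    """Cheap CRC32 first, to weed out accidental corruption quickly;
    only matches pay for the slower cryptographic check against tampering."""
    if zlib.crc32(data) != expected_crc32:
        return False  # corrupted in transit: no need to run the stronger hash
    return hashlib.sha256(data).hexdigest() == expected_sha256
```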
For hash maps, you want speed and spread across buckets.
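That spread is easy to check empirically; a quick sketch (the bucket count and key set are arbitrary) counting how evenly Python's built-in `hash()` distributes string keys:

```python
from collections import Counter

def bucket_counts(keys, n_buckets: int = 8):
    """Assign each key to a bucket the way a hash table would: hash mod table size."""
    counts = Counter(hash(k) % n_buckets for k in keys)
    return [counts.get(b, 0) for b in range(n_buckets)]

# A hash-map-friendly hash leaves no bucket starved and no bucket overloaded.
spread = bucket_counts([f"user-{i}" for i in range(10_000)])
```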
To provide flexibility, core crypto hash functions aim for two goals: 1. throughput and 2. limiting collisions.
[0] If you want to hash user passwords (and you should), there are currently three known good choices, all with variable workloads (meaning you can increase the number of "rounds" without decreasing the entropy, making the passwords as hard to crack as you can afford): PBKDF2, bcrypt and scrypt. The first two are CPU-hard (the variable workload deals with the amount of computation only); the last is also memory-hard (it also takes a variable amount of memory, configured when hashing, making it much harder to parallelize on GPUs or ASICs). The point of variable workloads is that as CPUs improve you can add more rounds to keep up: for instance, the original PBKDF2 RFC recommended 1000 iterations in 2000, the current standard is generally 10000 to 20000, and OWASP recommends 64000. Generally speaking you want to give it as much time as you can, on your production hardware, without user disruption; shooting for at least 250ms is a good idea in most systems, which will only rarely need to log in users and apply this delay.
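The "size it to ~250ms on your own production hardware" advice can be automated; a rough calibration sketch (the probe size and target time below are illustrative):

```python
import hashlib
import os
import time

def calibrate_iterations(target_seconds: float = 0.25, probe: int = 20_000) -> int:
    """Time a probe run of PBKDF2 and scale the iteration count linearly,
    so a full run takes roughly target_seconds on this machine."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe-password", salt, probe)
    elapsed = time.perf_counter() - start
    return max(probe, int(probe * target_seconds / elapsed))
```

Re-running this as hardware improves is exactly the "add more rounds to keep up" loop described above.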
rainsford | 13 years ago:
Not at all true. There are many applications where a fast hash function is desirable and speed does not really impact its security. File integrity, for example, is less of a burden to check with a fast hash, and the attacks you'd mainly be worried about are some form of collision attack or second pre-image attack. For secure hash functions, the number of hash operations required to carry out those attacks is so huge that the speed of the individual hash operation doesn't make much of a difference.
Where people always seem to get this idea from is password storage. In that scenario, the attack is an attacker trying to brute force guess a secret input from the hashed output. Since passwords don't tend to be very good, the number of hash operations required in that attack is low enough that suddenly the speed of the hash function DOES impact how feasible the attack is.
The solution to this problem isn't that all hash functions need to be slower, it's that people should stop using fast hash functions for password storage. Fortunately there are a number of ways to turn a fast hash function into a slow password storage function. Or you can use a function specially designed for password storage like bcrypt or scrypt.
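As one example of a purpose-built option, Python's standard library exposes scrypt directly (the parameters below are illustrative, not a recommendation: n is the CPU/memory cost, r the block size, p the parallelism; tune them for your hardware):

```python
import hashlib
import os

def scrypt_hash(password: str):
    """Memory-hard password hash via stdlib scrypt; returns (salt, digest).
    n=2**14, r=8, p=1 keeps memory use around 16 MiB per hash."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest
```

The memory-hardness is the design choice that matters here: it blunts GPU and ASIC parallelism in a way that iterated fast hashes alone do not.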
Hash functions are generally designed to be fast. For the very concern you mentioned, some people prefer storing passwords with http://en.wikipedia.org/wiki/Bcrypt, which, on the other hand, is designed to be slow.
You have this completely backwards. MD5 is safe (relatively speaking) as the hash part of a signature as there are no known viable pre-image attacks against it at this point. For password storage, it's broken simply because it's fast, as has been pointed out many, many times.
Shish2k | 13 years ago:
Nope; doing a security demonstration recently, given a list of ~100 md5'ed passwords I'd brute-forced half of them in 15 minutes, with a lowest-end GPU doing the work. Salting would make it an order of magnitude harder, but that's still "not very hard at all".
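A toy version of that demonstration (the leaked digests and wordlist here are made up) shows why fast, unsalted MD5 falls so quickly: one hash per dictionary word is all it takes to check every user at once:

```python
import hashlib

# Hypothetical leaked table of unsalted MD5 password digests.
leaked = {hashlib.md5(p.encode()).hexdigest() for p in ("hunter2", "letmein")}

# The attacker hashes each guess once and checks for membership.
wordlist = ["password", "hunter2", "qwerty", "letmein", "dragon"]
cracked = [w for w in wordlist if hashlib.md5(w.encode()).hexdigest() in leaked]
```

With a per-user salt, the attacker must redo the whole wordlist for every user instead of once for the whole table, which is the order-of-magnitude slowdown mentioned above.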
lucb1e | 13 years ago:
All in all, have a look at this if you want to know the real answer: http://security.stackexchange.com/questions/211/how-to-secur...
[1] http://security.stackexchange.com/questions/33108/why-does-s...
[2] http://security.stackexchange.com/questions/33531/why-improv...
Systemic33 | 13 years ago:
Using a general-purpose hash algorithm for password storage is just plain wrong. Use something designed for password storage instead, such as bcrypt.
wereHamster | 13 years ago:
NO, NO, NO, NO. How many times does that need to be repeated? Don't use SHA to hash passwords!
masklinn | 13 years ago:
MD5 has never ever been secure for password storage, and now less than ever.
This also applies to all SHA variants, by the way: a plain cryptographic hash is not acceptable for password hashing.