How Dropbox securely stores your passwords

[+] Someone1234|9 years ago|reply

I cannot see any obvious weaknesses in this scheme.

It seems to address a known pain point in bcrypt (max length), implements a pepper in a secure way (which cannot inadvertently degrade security), and is otherwise doing things which are best practices (high work factor, per user salt, etc).

I know peppers remain controversial (some people claim they're pointless, and make a good argument). But ultimately nothing Dropbox is doing with peppers in this article makes your password easier to break, only harder.

I'd call this scheme 10/10.

[+] perfectfire|9 years ago|reply

Proof that they're not pointless: the Adobe password leak. Other than the giant crossword puzzle[0] created by the password hints combined with their choice of ECB mode to encrypt the passwords, which allowed people to infer blocks of passwords, I haven't been able to find any evidence that the encryption key was leaked or guessed. So most of the passwords were never discovered. I'm betting their key was a full 168-bit random value that was immediately deleted when the leak came to light, so it's likely that value will never exist again in this universe. Compare that to something like LinkedIn (SHA1), where enthusiasts have cracked almost 97% of the passwords in that leak. How many more have blackhats cracked?

I certainly wouldn't rely on symmetric encryption alone to store passwords. If the key leaks, you expose all passwords in mere seconds. Plus you can see your users' plaintext passwords (since you have the key), which you should not be able to do. But as an extra measure, symmetric encryption has already proven itself to be useful.

[0] https://xkcd.com/1286/

[+] dsacco|9 years ago|reply

It's a good system, especially compared with the current best practice of simply hashing passwords with bcrypt and calling it a day.

I can't recall it off the top of my head, but Facebook has a similarly impressive system with more secret sauce involved for performance at scale. I believe what they do is the following:

1. Hash the password with MD5(password).

2. Generate a 20-byte (160-bit) random salt (this is well over the 64 bits you'd need to defend against birthday attack collisions).

3. Hash with hmac_sha1(hash, salt).

4. Send this value to a separate server for further operations (mitigates offline brute-forcing).

5. Hash in a secret key with hmac_256(hash, secret). Note this operation is on a separate server. The secret key might be colloquially termed a "pepper".

6. Hash with scrypt(hash, salt) to make local computation slower.

7. Shrink the final value with hmac_256(hash, salt) for efficient database storage.

If any Facebook engineers are around, please correct me if I've missed or misinterpreted any part of that.
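
For anyone who wants to poke at it, here is a rough sketch of the seven steps as I listed them, using only Python's standard library. The name `onion_hash`, the scrypt cost parameters, and the choice of the salt as the HMAC key in steps 3 and 7 are my assumptions, and the separate server from steps 4 and 5 is simulated by a local function. It is an illustration of the layering, not Facebook's code.

```python
import hashlib
import hmac
import os

# Stand-in for the secret held on the separate server (steps 4-5).
# In the described design this key never reaches the web tier.
REMOTE_SECRET = os.urandom(32)


def remote_keyed_hash(value: bytes) -> bytes:
    """Simulates the remote service: HMAC-SHA256 under the secret key."""
    return hmac.new(REMOTE_SECRET, value, hashlib.sha256).digest()


def onion_hash(password: str, salt: bytes) -> bytes:
    h = hashlib.md5(password.encode("utf-8")).digest()   # step 1
    h = hmac.new(salt, h, hashlib.sha1).digest()         # step 3
    h = remote_keyed_hash(h)                             # steps 4-5
    # step 6: memory-hard hash (cost parameters are assumed, not Facebook's)
    h = hashlib.scrypt(h, salt=salt, n=2**12, r=8, p=1, dklen=32)
    return hmac.new(salt, h, hashlib.sha256).digest()    # step 7


salt = os.urandom(20)  # step 2: 20-byte random salt
stored = onion_hash("correct horse battery staple", salt)
print(stored.hex())
```

Without `REMOTE_SECRET`, a leaked `(salt, stored)` pair cannot be brute-forced offline, which is the point of step 4.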

[+] daddykotex|9 years ago|reply

Can you explain a use case where incorrect use of a pepper actually degrades security? (just curious)

[+] novaleaf|9 years ago|reply

Glad to know that my self-designed system pretty much matches this "10/10" scheme :)

The only difference is that I perform the SHA512 hash client-side, so that the user's plaintext password isn't sent to my servers. Any thoughts on that?

[+] benhoyt|9 years ago|reply

The "obvious weakness" is the non-technical part of this: you can sign up to Dropbox (and most services for that matter) with an extremely weak password. I just signed up with a dummy email address and a password of "password". :-) If you look through password lists that have leaked online, the most common passwords are very easily guessable.

Anyway, not trying to dismiss their efforts here -- they're good. But this is only half of the equation.

[+] kijin|9 years ago|reply

A note about combining SHA512 with bcrypt: Don't feed the raw binary output of SHA512 into bcrypt. Use the hexadecimal or base64-encoded form instead. (Dropbox probably does this already, since they mention base64 in passing.)

bcrypt is known to choke on null bytes. Each SHA512 hash has roughly a 22% chance (1 - (255/256)^64) of containing a null byte if you use the raw binary format.

Using hex or base64, of course, decreases the amount of entropy that you can fit into bcrypt's 72-byte limit. But you can still fit 288 to 432 bits of entropy in that space, which is more than enough for the foreseeable future.
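
The arithmetic is easy to check. The sketch below assumes SHA512 output bytes behave as uniformly random, and `as_c_string` is a hypothetical helper that mimics what a C-string-based bcrypt binding would see; it is not part of any bcrypt library.

```python
import hashlib

DIGEST_BYTES = 64   # SHA-512 output size in bytes
BCRYPT_LIMIT = 72   # bcrypt's input limit in bytes

# Probability that at least one of 64 uniformly random bytes is 0x00.
p_null = 1 - (255 / 256) ** DIGEST_BYTES

# Entropy that fits into 72 characters of each text encoding.
hex_bits = BCRYPT_LIMIT * 4      # 4 bits per hex character
base64_bits = BCRYPT_LIMIT * 6   # 6 bits per base64 character


def as_c_string(data: bytes) -> bytes:
    """Everything after the first NUL byte is silently dropped."""
    return data.split(b"\x00", 1)[0]


# Empirical check: count digests that contain a NUL byte.
trials = 20000
hits = sum(
    b"\x00" in hashlib.sha512(str(i).encode()).digest() for i in range(trials)
)
print(f"analytic: {p_null:.4f}  observed: {hits / trials:.4f}")
print(hex_bits, base64_bits)  # 288 432
```

A digest that starts with a NUL byte is the worst case: `as_c_string` returns an empty string, so every such password would hash identically.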

[+] raverbashing|9 years ago|reply

And that's what Django does in this case, though it uses SHA256 as the first step https://github.com/django/django/blob/master/django/contrib/...

[+] MichaelGG|9 years ago|reply

What a strange thing to do. Is this due to some C-style string handling? Why should any hashing function treat input as text?

[+] benmanns|9 years ago|reply

Thanks for the reminder. You could encode with a "base255" algorithm that just excludes the null byte to retain more entropy within 72 bytes.

[+] Lucretiel|9 years ago|reply

It sounds like they do:

> For ease of elucidation, in the figure and below we omit any mention of binary encoding (base64).

[+] 0x0|9 years ago|reply

It's nice to store passwords securely, but it's also important to remember to, you know, actually verify them afterwards ;)

https://techcrunch.com/2011/06/20/dropbox-security-bug-made-...

[+] eropple|9 years ago|reply

That is literally five years old at this point and is, at best, a cheap shot. Let's be better.

[+] borplk|9 years ago|reply

As someone who exclusively uses a password manager with random unique passwords for each service it always amuses me to see posts like this.

Years ago I relieved myself from the stress by using a password manager. Now for all I care they could be storing it in plaintext and it wouldn't make a damn difference to me. Problem solved.

[+] csharp|9 years ago|reply

It would still make a significant difference since someone could still compromise your Dropbox account... Having a password manager doesn't all of a sudden make all of your passwords secure on all of your different accounts.

[+] mentat|9 years ago|reply

Make sure you don't read any of Tavis Ormandy's tweets or posts about the state of password manager security then.

[+] Kratisto|9 years ago|reply

Just curious which one do you use?

[+] strictfp|9 years ago|reply

Unless someone steals the password to your password manager...

[+] cperciva|9 years ago|reply

> We considered using scrypt, but we had more experience using bcrypt.

Ok, fair enough...

> The debate over which algorithm is better is still open, and most security experts agree that scrypt and bcrypt provide similar protections.

... wait, what?

[+] tptacek|9 years ago|reply

They're badly expressing the idea that there may be only marginal benefits to optimizing the genus of password hashes used, so long as you're using a serious construction designed for storing (or generating keys from) passwords.

The debate over whether scrypt is better than bcrypt is not really still open. The debate over whether the difference matters that much in practice might be.

For what it's worth: for new systems, I use scrypt. But if someone asked, and they didn't have a very specialized application, I'd tell them that switching to scrypt from bcrypt, or even PBKDF2, would be a waste of money.

[+] red_admiral|9 years ago|reply

Here's how Facebook does it: http://chunk.io/f/72f9c680ac2a4777b6dbf33c532e1d3c.jpg (Alec Muffett talking at RealWorldCrypto)

Seems like the combination of strong hash + encryption on a HSM is the way to go these days. Dropbox's scheme looks good to me.

[+] joepie91_|9 years ago|reply

One concern I have here is that people are going to perceive this post as "this is what you should do and it's easy!", because the post doesn't really address the complexities of implementing this kind of thing.

As a result, we're probably going to have a bunch more issues like this one: http://blog.ircmaxell.com/2015/03/security-issue-combining-b...

I'm not looking forward to having to talk people off that particular ledge for the next several months...

[+] aomix|9 years ago|reply

Cool approach: you need to compromise two separate servers just to have a usable password database you could run tools against. A key compromise can be fixed quickly, and a password compromise is useless without the key.

[+] dsl|9 years ago|reply

Of the last 10 or so security engagements I have done, I can only recall one where I wasn't able to compromise _all_ servers. Once you get the first few, the incremental work to get everything is relatively small.

When breaking in, your end goal isn't the database server... it's the domain controller or the configuration management server.

[+] sandGorgon|9 years ago|reply

Does anyone know what is a good practice to create a "vault" - the kind that is used for the Pepper in this case?

I have heard of it being a separate, IP-restricted server with a daily changing IP address, etc. A simpler use case would be to store OAuth2 tokens or some kind of PII.

[+] shawabawa3|9 years ago|reply

I've heard before of it just being stored in the codebase. Doesn't add much security but it does mean both the database server and at least 1 of the app servers or the codebase have to be breached

[+] evunveot|9 years ago|reply

> Some implementations of bcrypt truncate the input to 72 bytes, which reduces the entropy of the passwords.... By applying [SHA512], we can quickly convert really long passwords into a fixed length 512 bit value, solving [that problem].

This part confused me. How can truncating to 72 bytes be a more severe reduction in entropy than generating a 64-byte hash?

[+] grenoire|9 years ago|reply

Password lengths are variable. With passwords longer than 72 ASCII characters, you will lose entropy after that.

Let A be a 72 character long string, and B be A + X. Regardless of what X is, when bcrypted the result for A and B will be the same.
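
A small simulation of that collision, and of how pre-hashing avoids it. To stay within Python's standard library, `truncating_hash` is a stand-in that cuts the input at 72 bytes and then runs scrypt; it is not bcrypt, and both function names are made up for this sketch.

```python
import base64
import hashlib
import os

BCRYPT_LIMIT = 72


def truncating_hash(data: bytes, salt: bytes) -> bytes:
    """Stand-in for a bcrypt implementation that silently truncates input.
    NOT bcrypt: scrypt is used as the slow hash purely for illustration."""
    return hashlib.scrypt(
        data[:BCRYPT_LIMIT], salt=salt, n=2**10, r=8, p=1, dklen=32
    )


def prehashed_hash(password: bytes, salt: bytes) -> bytes:
    """SHA-512 first, then base64 so no NUL bytes reach the slow hash."""
    digest = base64.b64encode(hashlib.sha512(password).digest())
    return truncating_hash(digest, salt)


salt = os.urandom(16)
A = b"a" * 72
B = A + b" plus a suffix that bcrypt never sees"

print(truncating_hash(A, salt) == truncating_hash(B, salt))  # True: collision
print(prehashed_hash(A, salt) == prehashed_hash(B, salt))    # False
```

The base64 digest is 88 characters, so it is still cut at 72, but those 72 characters depend on every byte of the original password.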

[+] CiPHPerCoder|9 years ago|reply

If you had two users with a very long password, but the first 100 characters were identical, they'd collide.

  - Pre-hashing makes this less likely.
  - Encoding the pre-hashed value to prevent NUL bytes is important.

But it's a bandaid solution, to be quite honest. We're better served by migrating to Argon2i, which doesn't have these quirks.

[+] microcolonel|9 years ago|reply

I think they're talking about entropy per bit. If they hash to 64 bytes, they integrate all of the entropy of the password, if they truncate to 72 bytes, they throw away all entropy past 72 bytes. This could be a huge problem if you're one of those people who uses a common prefix with a suffix as their password pattern for passwords they need to remember.

[+] OskarS|9 years ago|reply

> If we use the global pepper for hashing, we can’t easily rotate it.

I don't get this point. Why is it harder to rotate pepper for a hash compared to an encryption key?

[+] jxcl|9 years ago|reply

Because encryption can be decrypted with the correct key. A hash can't be reversed once you've hashed something with the pepper.
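
To make that concrete, here is a sketch of rotating the key without ever seeing a user's password. Everything in it is a stand-in: `toy_encrypt` is a SHAKE-256 keystream XOR used only because Python's standard library has no AES, and `slow_hash` uses scrypt in place of bcrypt. Do not use the toy cipher for anything real.

```python
import hashlib
import hmac
import os


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher (SHAKE-256 keystream XOR), a stand-in for AES-256."""
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def slow_hash(password: str, salt: bytes) -> bytes:
    """Stand-in for the bcrypt(SHA512(password)) layers."""
    pre = hashlib.sha512(password.encode("utf-8")).digest()
    return hashlib.scrypt(pre, salt=salt, n=2**10, r=8, p=1, dklen=32)


def rotate(blob: bytes, old_key: bytes, new_key: bytes) -> bytes:
    """Re-wrap a stored hash under a new key; the password is not needed."""
    return toy_encrypt(new_key, toy_decrypt(old_key, blob))


def verify(password: str, salt: bytes, blob: bytes, key: bytes) -> bool:
    return hmac.compare_digest(slow_hash(password, salt), toy_decrypt(key, blob))


salt, old_key, new_key = os.urandom(16), os.urandom(32), os.urandom(32)
stored = toy_encrypt(old_key, slow_hash("hunter2", salt))
stored = rotate(stored, old_key, new_key)
print(verify("hunter2", salt, stored, new_key))
```

If the pepper were mixed into the hash instead, `rotate` would have nothing to unwrap, and every user would have to log in again before their record could be rebuilt.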

[+] Achshar|9 years ago|reply

You'd have to recalculate hashes for every user?

[+] martinko|9 years ago|reply

A bit of an overkill, no? Doesn't bcrypt suffice?

[+] OskarS|9 years ago|reply

Assume their salt+hash database leaks. It's true that salting the passwords and using bcrypt would prevent mass cracking of the database (i.e. you would have to crack each password individually, not the entire database at once), but it would still be feasible to crack a single user's password if it was weak enough (which would be worth it, if the user is, say, [email protected]).

Using pepper prevents that from happening, and storing it separately from your database makes it much harder to get both.

[+] ppierald|9 years ago|reply

I would be interested in the details of the storage mechanism of the global pepper. Is this in an HSM? For AWS customers, something like KMS? There are then huge operational and redundancy issues to think about. Failovers for your HSM. Handling the possibility that AWS might not be available or corrupt the key, other cases. These things are easy to whiteboard, but when the rubber hits the road and you need to think about all the operational edge cases, things get hard quick.

[+] dsacco|9 years ago|reply

It's not in an HSM. Dropbox states towards the end of the article that they're exploring HSM applications for pepper storage, which I think is a great idea. If I recall correctly, Facebook is also exploring (or has already implemented) an HSM for password database secret key storage.

You raise good points though. This system is significantly safer than best practices (bcrypt(password, 10)), but it has significantly more overhead. There are also diminishing returns here. For a company of Dropbox's size - sure, invest in this. For a company that came out of YC S16, no, don't bother. Just properly bcrypt/PBKDF2/scrypt/argon2 the thing and revisit much later.

I love it, but I would not recommend this system to my clients for password storage unless they had a very mature operations/reliability team.

[+] phonon|9 years ago|reply

Ummm..

"Going forward, we’re considering storing the global pepper in a hardware security module (HSM). At our scale, this is an undertaking with considerable complexity, but would significantly reduce the chances of a pepper compromise."

[+] yladiz|9 years ago|reply

Realistically, how much better is this than the standard bcrypt recommendation? I don't mean for a company the size of Dropbox/Facebook/etc., I mean in general, will this really be much more useful than just using bcrypt? Using an encryption key means that if the database is compromised, as long as the OS isn't (or wherever the key is being stored), the passwords are encrypted in a way that's effectively impossible to decrypt, which is nice. However, are they sure that hashing the password first before hashing it in bcrypt won't cause issues?

Unless Dropbox employs or contracted someone to verify that this is okay (not an engineer, a mathematician/cryptographer who can understand the math behind the algorithms) I'd be hesitant about it. Same goes for other companies that do some complex sequences of hashing e.g. Facebook. Implementing the idea is engineering related, but verifying it is not, and I don't trust engineers (including myself) to verify that a specific algorithm or sequence of algorithms is valid.

[+] faragon|9 years ago|reply

From the diagram, Dropbox stores no passwords: it stores an encrypted hash (hashing in two steps, SHA512 and then "bcrypt") of the password. I.e. stored = AES256(bcrypt(SHA512(password), per_user_salt, 10), global_key).

I would like to know if "salted-bcrypt"+SHA512 hashing is really safer than using just SHA512 (e.g. because of the risk of making locating hash collisions easier, etc.).

[+] cstrat|9 years ago|reply

I posted a separate comment, but will post here again...

Dropbox do store one password AFAIK. It appears that they store the OS X user's administrator password... I am keen to see if they address this somewhere.

See discussion: https://news.ycombinator.com/item?id=12457067

[+] CiPHPerCoder|9 years ago|reply

Their solution is very similar to the mode prescribed by [1] and implemented in [2].

There are actually two problems with bcrypt:

  - It truncates after 72 characters
  - It truncates after a NUL byte

If anyone is dead set on following Dropbox's example, make sure you aren't passing raw binary to bcrypt. You're playing with fire.

Additionally, if you're going to use AES-256, don't implement it yourself. Use a well tested library that either uses AEAD or an Encrypt then MAC construction.

[1]: https://paragonie.com/blog/2016/02/how-safely-store-password...

[2]: https://github.com/paragonie/password_lock

[+] warbiscuit|9 years ago|reply

Not sure I understand the purpose of a MAC in this case. What benefit does it provide to hash storage? If the attacker has write access to your database to tamper with the hash, they will most likely also be able to sign up as a user, and clone that (properly signed + encrypted) hash over to whichever account they want to log into. When cracking the hash, they'll just ignore the MAC.

[+] arielb1|9 years ago|reply

nitpick: encrypt-then-MAC is an AEAD construction.

[+] figers|9 years ago|reply

How dropbox "NOW" securely stores your passwords

[+] oDot|9 years ago|reply

While this is very impressive, it feels like trying to solve the wrong problem. The real problem is getting rid of passwords (Persona, anyone?).

Don't get me wrong, what's described there is super-important to secure the authentication of today, but what about a word for the authentication of tomorrow?

There already are various solutions. Passwordless[0] is a familiar one for nodejs, and I recently bumped into the promising Portier[1], which is, according to its authors, a "spiritual successor to Mozilla Persona".

[0] https://passwordless.net/

[1] https://portier.github.io/

[+] Jahava|9 years ago|reply

The blog mentions, "We’re considering argon2 for our next upgrade". I suppose they could do in-line upgrades: as users are signing in, the SHA512 is piped through the old pipeline for verification and through the new pipeline for migration. As far as I can tell, there's no way for them to swap bcrypt out for argon2 using just their cold store.
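
A sketch of what such an in-line upgrade could look like. The record layout, the `scheme` tag, and the function names are hypothetical, and since Python's standard library has neither bcrypt nor argon2, PBKDF2 stands in for the old pipeline and scrypt for the new one.

```python
import hashlib
import hmac
import os


def old_hash(pre: bytes, salt: bytes) -> bytes:
    """Stand-in for the current bcrypt layer."""
    return hashlib.pbkdf2_hmac("sha256", pre, salt, 10_000)


def new_hash(pre: bytes, salt: bytes) -> bytes:
    """Stand-in for the future argon2 layer."""
    return hashlib.scrypt(pre, salt=salt, n=2**12, r=8, p=1, dklen=32)


SCHEMES = {"old": old_hash, "new": new_hash}


def login(record: dict, password: str) -> bool:
    """Verify with whatever scheme the record uses; upgrade on success."""
    pre = hashlib.sha512(password.encode("utf-8")).digest()
    expected = SCHEMES[record["scheme"]](pre, record["salt"])
    if not hmac.compare_digest(expected, record["hash"]):
        return False
    if record["scheme"] != "new":
        # The SHA512 value is only available here, at login time.
        record["salt"] = os.urandom(16)
        record["hash"] = new_hash(pre, record["salt"])
        record["scheme"] = "new"
    return True


salt = os.urandom(16)
pre = hashlib.sha512(b"hunter2").digest()
record = {"scheme": "old", "salt": salt, "hash": old_hash(pre, salt)}

print(login(record, "hunter2"), record["scheme"])
```

Accounts that never log in again stay on the old scheme, which is the limitation I mean: the stored data alone is not enough to migrate them.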

[+] Freaky|9 years ago|reply

> Some implementations of bcrypt truncate the input to 72 bytes, which reduces the entropy of the passwords. Other implementations don’t truncate the input and are therefore vulnerable to DoS attacks because they allow the input of arbitrarily long passwords.

Huh? BCrypt works by stuffing the password into a 72 byte Blowfish key and using it to recursively encrypt a 24 byte payload. Either it's truncating, or it's pre-hashing the password to fit much like they are.

The link they use to justify it is funny: http://arstechnica.com/security/2013/09/long-passwords-are-g...

That's just a naive PBKDF2 implementation that's pointlessly reinitializing the HMAC context each iteration instead of just doing it once at the start. The difference between storing a 1 byte and a 1MB password with PBKDF2 should be on the order of a couple of milliseconds.
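
Here is what I mean, as a sketch in Python using only the standard library. `pbkdf2_sha256` is my own illustrative function, not a library API; it keys the HMAC once and copies that context for every iteration, and the result is checked against `hashlib.pbkdf2_hmac`.

```python
import hashlib
import hmac


def pbkdf2_sha256(password: bytes, salt: bytes, iterations: int, dklen: int = 32) -> bytes:
    # The (possibly huge) password is absorbed into the HMAC key exactly once.
    base = hmac.new(password, digestmod=hashlib.sha256)

    def prf(message: bytes) -> bytes:
        h = base.copy()       # cheap: reuses the already keyed context
        h.update(message)
        return h.digest()

    out = b""
    block_index = 1
    while len(out) < dklen:
        u = prf(salt + block_index.to_bytes(4, "big"))
        t = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            u = prf(u)
            t ^= int.from_bytes(u, "big")
        out += t.to_bytes(32, "big")
        block_index += 1
    return out[:dklen]


salt = b"per-user-salt"
short_pw = b"x"
long_pw = b"x" * 1_000_000  # a 1MB "password"

for pw in (short_pw, long_pw):
    ours = pbkdf2_sha256(pw, salt, 1000)
    reference = hashlib.pbkdf2_hmac("sha256", pw, salt, 1000)
    print(len(pw), ours == reference)
```

The per-iteration cost is independent of the password length; only the initial keying touches the full 1MB.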

[+] arielb1|9 years ago|reply

Having the SHA-512 hash at the beginning simplifies the implementation because the "security" code only needs to handle 64-byte random strings (which are truncated to 54-byte strings for `bcrypt`, but still...). That removes all sorts of stupid edge cases that come with variable-length strings.

183 comments