Prevent users registering with passwords from data breaches

80 points | DivineOmega | 7 years ago | jordanhall.co.uk

123 comments

[+] dwighttk|7 years ago|reply
I need password fields to:

1)not silently fail when I try a 64 character (or 32 character) password

2)not fail and say my password is "too short" when it is 32 characters and you have an unrevealed maximum password length of fewer characters than that.

3)just all-around quit failing when my password is totally fine, it's a quasi-random string of letters, numbers, and symbols and I'll never type it...

oh yeah

4)don't disable pasting in the password field

4b)IF you do, freaking let me see what I've typed. I promise to not enter the password in a place someone can shoulder surf. Trust me that's WAY down on the threat list for my life.

Then, and only then, if you really want, keep me from registering with a breached password. But you had better tell me that is what is going on.

[+] cm2187|7 years ago|reply
And not complain it doesn’t have enough entropy because it lacks special characters when it is the hex encoding of a 128-bit key!

What I find the most annoying is how creative developers get with password validation. Some require special characters, but only from 2 or 3 allowed special characters (what good does it even do in terms of entropy???). Some limit the number of times a character can appear in a password, and I understand that it is meant to prevent people using “aaabbbccc” as a password, but it also makes long passwords impossible. Many have an artificially low length limit (why would they even do that?).

[+] 0db532a0|7 years ago|reply
There’s something that I don’t really understand regarding password managers which maybe you could explain. How is using a password manager to manage multiple passwords more secure than using a single password everywhere?

In the case of a password manager, if your computer is breached, then all passwords are breached. If your passwords are hosted encrypted on a website, then if that website is breached, the master password you send it will be visible to the attacker, and thus all passwords are breached.

In the case of using the same password everywhere, if one website is breached, then the password to all sites is breached.

Am I missing something?

[+] Dylan16807|7 years ago|reply
64? How much entropy is in the passwords you're pasting? While holistically I like to see maximum lengths of 200+, personally I'm satisfied with 20 characters holding 119 bits of entropy.
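
The 119-bit figure is consistent with 20 characters drawn uniformly at random from a 62-symbol alphabet (a-z, A-Z, 0-9). That alphabet is an assumption inferred from the number, not something stated above. A minimal sketch of the arithmetic, with `entropy_bits` as a made-up helper name:

```python
import math


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a string whose characters are chosen
    uniformly and independently from an alphabet of the given size."""
    return length * math.log2(alphabet_size)


# 20 alphanumeric characters (assumed 62-symbol alphabet): about 119 bits
print(round(entropy_bits(20, 62), 2))

# 32 hex characters, i.e. the hex form of a 128-bit key
print(entropy_bits(32, 16))
```

This only holds for machine-generated strings; a human-chosen password of the same length carries far less entropy.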
[+] hannob|7 years ago|reply
All of those rely on Troy Hunt's API. Which may be fine for some people, yet others may be uncomfortable introducing an external dependency. I generally recommend avoiding external API dependencies if you can.

Here's a python implementation using bloom filters which avoids storing the whole list (need to store ~1 gig), yet still gives you very good accuracy: https://gist.github.com/marcan/23e1ec416bf884dcd7f0e635ce5f2...

It's more what I think this should look like.
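
To make the idea concrete, here is a minimal Bloom filter sketch. It is not the code from the linked gist; the class name, the SHA-256 double-hashing scheme, and the 0.1% error rate are all assumptions chosen for illustration. A Bloom filter never reports a listed password as absent, but it can occasionally flag a password that is not on the list.

```python
import hashlib
import math


def bits_needed(capacity: int, error_rate: float) -> int:
    """Standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))


class BloomFilter:
    """Illustrative in-memory Bloom filter (hypothetical, not the gist's code)."""

    def __init__(self, capacity: int, error_rate: float = 0.001):
        self.size = max(8, bits_needed(capacity, error_rate))
        # Optimal number of hash functions: k = (m / n) * ln 2
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


breached = BloomFilter(capacity=1000)
for pw in ["password1", "iluvlucy", "123456"]:
    breached.add(pw)

print("iluvlucy" in breached)  # a listed password is always found
```

With these sizing formulas, 500 million entries at a 0.1% false-positive rate come to roughly 0.9 GB, which is in line with the ~1 gig mentioned above.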

[+] bartread|7 years ago|reply
Tend to agree on third party dependencies. They're a useful leg up if you're just prototyping, testing an idea, or generally groping around for product/market fit, but I'm not in love with them as a long term solution - certainly not for anything that's core to your product or service.

Too many potential pitfalls:

- If their service goes down, you may go down too, or you have the added complexity of gracefully handling the situation when they're down

- You have no control over their roadmap... which means you might suddenly get a bunch of non-value-adding but totally essential work dropped into your backlog, and perhaps at short notice, because of changes they make

- Perhaps they go out of business or, for whatever reason, shut down their service: again, congratulations, you've just got a load of work you didn't bank on getting in the way of delivering your own roadmap

I think most of us have probably seen multiple examples of the above, if not at first hand, then posted on HN or elsewhere on the web.

I'm not an NIH kind of guy but I do tend to prefer libraries, or co-located installs for long-term dependencies. That way at least you can manage migrations and updates according to your own agenda rather than somebody else's.

[+] ThePhysicist|7 years ago|reply
Similarly, I also wrote a locally hostable Golang-based REST web service that you can use to check plain passwords or hashes against the HIBP database (and other dbs). It’s based on an optimized Bloom filter library and pretty fast. It also provides a CLI tool and libraries for Python and Go:

https://github.com/adewes/have-i-been-bloomed

[+] matheweis|7 years ago|reply
There’s no reason you can’t do both... simply build your internal code to check against multiple sources. You can asynchronously hit your bloom filter, HIBP, DeHashed, etc. and cut short whenever there is a hit.

In this way you get the best of all worlds; speed, highest degree of accuracy, and reduced dependency on a single external API.
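
A rough sketch of that pattern with `asyncio`, assuming each source is wrapped in an async function that returns True on a hit. The two checkers below are hypothetical stand-ins (a sleep plus a set lookup), not real clients for any service.

```python
import asyncio


async def check_local_filter(password: str) -> bool:
    # Hypothetical stand-in for a fast local Bloom filter lookup.
    await asyncio.sleep(0.01)
    return password in {"password1", "iluvlucy"}


async def check_remote_service(password: str) -> bool:
    # Hypothetical stand-in for a slower call to an external API.
    await asyncio.sleep(0.05)
    return password in {"123456"}


CHECKERS = [check_local_filter, check_remote_service]


async def is_breached(password: str, checkers=CHECKERS) -> bool:
    """Query every source concurrently and stop at the first hit."""
    tasks = [asyncio.create_task(check(password)) for check in checkers]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                if await finished:
                    return True
            except Exception:
                # A failing source gives no information; keep waiting on the rest.
                continue
        return False
    finally:
        # Cancel whatever is still running once we have an answer.
        for task in tasks:
            task.cancel()


if __name__ == "__main__":
    print(asyncio.run(is_breached("iluvlucy")))
```

Treating a failed source as "no information" is a design choice; a stricter site might prefer to reject the signup when no source can be reached.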

[+] hiccuphippo|7 years ago|reply
Ha, I was just about to ask for a bloom filter version of this. Thanks!
[+] air7|7 years ago|reply
This is over the top. Even enforcing password complexity is over-rated.

For online attacks, an attacker can't even try the top 1000 passwords for an account on any major website in reasonable time without triggering the alarm, as they all(?) have rate limiting (usually in the form of account lockdown after single-digit failed attempts).

For offline attacks, there first needs to be a breach. While they undoubtedly happen, they are very infrequent events. But once they happen, you should assume all passwords would be cracked very fast. Hackers can get their hands on a lot of computing power, and the brute-forcing attempts are not alphabetical, but rather use clever how-humans-choose-passwords models. You're relatively safe because of the low frequency of breaches, not because a hacker trying trillions of passwords a second will be frustrated by your password-choosing policy. I'm sure most of the passwords from the breaches would be attempted anyway.

Credential stuffing[0] is the real issue. If there was an API to test that a user isn't using this password on other websites, that would be very useful.

[0] https://www.owasp.org/index.php/Credential_stuffing

[+] epriest|7 years ago|reply
> For online attacks, an attacker can't even try the top 1000 passwords for an account on any major website in reasonable time without triggering the alarm, as they all(?) have rate limiting (usually in the form of account lockdown after single-digit failed attempts).

This is empirically a practical attack: attackers successfully executed a common password brute force attack against GitHub in late 2013 by using a botnet with 40,000 distinct remote addresses:

https://github.blog/2013-11-20-weak-passwords-brute-forced/

[+] hombre_fatal|7 years ago|reply
Our solution for a bitcoin casino was to generate passwords for users. But you can imagine how few sites can get away with such a thing. Our create-password input was a disabled textfield with a reroll button.

Before that, attackers would just wait for new usernames to appear on the scoreboard/chat and check them against password dumps. The easy come, easy go nature of bitcoin made it particularly lucrative.

Password reuse is a massive issue. At a glance, one might wonder why most sites need to care so much since they don't deal with money. Who cares about a forum like HN? But consider that it's easier to audit/impede new accounts with anti-spam measures, so there's value in taking over old accounts. And you don't want people with moderation tools getting attacked either. Ideally, it's nice to be able to trust an account with 1,000 posts more than one with 0 posts, but that evaporates when accounts are easily stolen.

Aside, how do you implement account-locking without making it trivial for users to DoS each other that way?

[+] spacehome|7 years ago|reply
I'm not so sure about that. Hire time on a botnet so you have lots of IPs and start trying mixed usernames with mixed passwords. You won't have any choice over which username you eventually break in with, but I don't think you'd trip any alarms either.
[+] scarhill|7 years ago|reply
Making credential stuffing harder is the main reason to do this. Credential stuffing works because users reuse credentials across sites. If someone attempts to use a password from the HIBP database, the two most likely cases are that it's extremely common or the same person is reusing it. Extremely common passwords are bad for all sorts of reasons and the same person reusing a breached password makes the account vulnerable to credential stuffing.
[+] ss248|7 years ago|reply
You're right. I don't know why it's still not common knowledge that nobody bruteforces login pages in this day and age. But I guess too many people who have products to sell benefit from the status quo, so things are not going to change.
[+] cpburns2009|7 years ago|reply
And I thought we were getting away from arcane rules for passwords. Now you have to avoid every compromised password from any unrelated account? I may use random passwords, but I don't expect the typical consumer to do the same. Sometimes I simply don't care about security for a one off account on a free service where I'll happily use the simplest permutation of "password" for the password.
[+] pjbster|7 years ago|reply
I think every entity which holds a password should adopt the same policy: from the moment the password is created, the entity will undertake to do everything it can to try and break it. Whatever it takes - bot farms, 3rd-party white hat outfits, social engineering in the canteen, whatever. Anything (legal) goes.

Once the password is broken, the account will immediately be placed in suspense until the owner creates a new password. Which, of course, immediately gets fed back into the machine and life goes on.

This would eliminate password rules. If you want to create a password which consists of, say, 100 consecutive zeros, go for it. But you might only get to use it for a fraction of a second if the network can break it quickly.

[+] Dylan16807|7 years ago|reply
Most people don't know how to make a good password. So don't let them make a bad password, and make that easy by providing a button that generates good ones.

"It has to be new" is about as elegant as you can get for password rules. You can use any secure method you want. Whatever characters you want, just don't do it wrong. If you want it to be easier, press the button.

Even with a one-off account, a complex password you don't bother to remember is better than a weak password, because the service won't have to deal with tons of misused accounts.
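
What the server side of such a button might look like, as a minimal sketch: the function name, the 20-character default, and the alphanumeric alphabet are assumptions, not anything prescribed above. The `secrets` module is the standard-library source of cryptographically strong randomness.

```python
import secrets
import string

# Assumed alphabet: letters and digits only, to dodge sites that choke on symbols.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 20) -> str:
    """Return a random password; each character is drawn independently
    and uniformly from ALPHABET using a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Each press of the "reroll" button would simply call this again.
print(generate_password())
```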

[+] geofft|7 years ago|reply
The existence of password breach databases means that attackers are already attempting other people's passwords against your accounts.

We should honestly move to the world where typical consumers are using password managers that generate passwords randomly. I think it is pretty reasonable to expect the typical consumer to install and use a password manager; I think it's pretty unreasonable to expect the typical consumer to generate memorable, strong, and unique passwords for each website and store them in their head.

[+] rmtech|7 years ago|reply
If the service doesn't care about security at all then you could just not have a password, log on using your email.

If it cares a little bit then you have to characterize exactly how bad it would be if the account was compromised. The right solution might be to just accept any non-blank password that's less than 32 chars or something.

[+] thaumaturgy|7 years ago|reply
You don't even need to call out to an external API to get good coverage for this rule, in case you're averse to doing such a thing.

You can just do a case-insensitive match against this file that I compiled a while back: https://github.com/robsheldon/bad-passwords-index

It includes the most commonly reused passwords according to in-the-wild breaches.

I'm a bit embarrassed to see that it's been 2 years since the last update. I was thinking recently about updating this again. I think I'll do that.
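
A minimal sketch of the case-insensitive match, assuming the list is a plain text file with one password per line. The file format, file name, and function names here are assumptions for illustration; check the repository for the actual layout.

```python
import os
import tempfile


def load_bad_passwords(path: str) -> frozenset:
    """Read one password per line (assumed format) into a case-folded set."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return frozenset(
            line.strip().casefold() for line in handle if line.strip()
        )


def is_bad_password(candidate: str, bad_passwords: frozenset) -> bool:
    """Case-insensitive membership test against the loaded list."""
    return candidate.casefold() in bad_passwords


# Demo with a throwaway file standing in for the real list.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "bad-passwords.txt")
    with open(demo_path, "w", encoding="utf-8") as out:
        out.write("password\n123456\niluvlucy\n")
    bad = load_bad_passwords(demo_path)

print(is_bad_password("PassWord", bad))
```

Loading the list once at startup keeps each check to a single set lookup.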

[+] CM30|7 years ago|reply
So what happens when a significantly large percentage of 'standard' passwords are disallowed and your average Joe just can't be bothered to create another one?

Seems like eventually you'll end up driving away everyone who isn't tech savvy/doesn't use a password manager/randomly generated password, which seems like something that'll significantly limit your site or app's audience.

I get the logic behind it, and it's a neat idea on a security level, but it seems like a guaranteed way to drive your userbase to your competitors by making it annoying to sign up.

[+] pbhjpbhj|7 years ago|reply
If you need your users to have a secure password then that's a good thing. We need to make password managers simple enough, and default, for all users.
[+] tialaramex|7 years ago|reply
> So what happens when a significantly large percentage of 'standard' passwords are disallowed and your average Joe just can't be bothered to create another one?

This type of question should always be asked with an "XKCD What If?" or "Back of the envelope" calculation attached so that when responding people can respond to your numbers rather than a vague intuition like "large percentage of standard passwords".

This way when doing the calculation you sometimes go "Oh, I see now" and then you don't have to post the question at all because you learned something on your own just like if you are wondering what a "password" is and then check a dictionary you don't have to ask "What's a password?"

What's a "significantly large percentage?" Let's suppose it's 10%. If 10% of these "standard passwords" are disallowed, and a user is only willing to try twice before giving up then that means 1% of users picking random "standard passwords" can't sign up, which we could argue is a noticeable problem for your site.

Next we need to ask ourselves what's a "standard password"? Do we just mean "a password from the top 100 most common passwords list?". If so game over, we've decided by policy to have users with bad passwords that will be guessed, their accounts will constantly get broken into, and I guess now we need to figure out why our service is "compelling" enough that you want to pay for it anyway despite the constant break-ins.

So let's suppose we have a broader definition of "standard password". Maybe it's 8 characters chosen from A-Z. That sounds like a "standard" password my mother would choose. That's over 200 billion "standard passwords". Troy's service blacklists, wait for it... 500 million passwords.

Let's guesstimate that the list grows by half that amount every single year - meaning 250 million new unique passwords are revealed by idiots every single year. We should hit that concerning 10% rate no sooner than... 78 years from now. If in that time we cannot come up with anything better than "try to memorise a unique password eight letters long" then that's a far more serious technical failure.
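
The back-of-the-envelope numbers above can be reproduced in a few lines. The sketch makes the same pessimistic assumption as the estimate, namely that every breached password falls inside the "standard" space; the function name is made up for illustration.

```python
def years_until_fraction(space: float, blocked_now: float,
                         growth_per_year: float, fraction: float) -> float:
    """Years until `fraction` of the password space is blocked,
    assuming linear growth and that every blocked password is in the space."""
    return (fraction * space - blocked_now) / growth_per_year


exact_space = 26 ** 8  # 8 characters chosen from A-Z
print(exact_space)     # a little over 200 billion

# Using the rounded 200 billion figure, as in the estimate above:
print(years_until_fraction(200e9, 500e6, 250e6, 0.10))

# Using the exact size of the space gives a slightly later date:
print(round(years_until_fraction(exact_space, 500e6, 250e6, 0.10), 1))

# Chance that two independent picks are both blocked at a 10% block rate:
print(round(0.10 ** 2, 4))
```

The 78-year figure corresponds to rounding the space down to 200 billion; with the exact 26^8 it comes out a few years later, which only strengthens the point.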

The real problem is that people think they must be the only person in the entire world to have chosen "iluvlucy". That's a "standard password" by our definition above, but it's also ludicrously obvious. Pwned Passwords lets us distinguish "iluvlucy" from "xlvghydm" not by some crappy heuristic that would also catch other things but by the fact that lots of people already used it and had the password revealed.

[+] draugadrotten|7 years ago|reply
What is the latest and greatest in password creation rules anyway? It is hard to define a password policy which covers both users' practical needs and still makes ERP systems secure and follows best practice of the major players.

Microsoft research has an interesting paper on it. Are there more like this out there? https://www.microsoft.com/en-us/research/wp-content/uploads/...

Hints welcome!

[+] 21|7 years ago|reply
This is a terrible idea which will backfire.

Many users have a "universal weak password" for sites that don't really matter; now you will be forcing them to jump through hoops for no good reason.

[+] jperry2019|7 years ago|reply
It might be frustrating for users to have their typed password rejected without an explanation. Since there’s no authoritative list of compromised passwords they would have to take your word for it.

Wouldn’t a better solution be to increase password requirements until users are forced to generate one using a password manager? If you can memorize a password it’s probably not secure.

[+] blackflame7000|7 years ago|reply
I feel like if they have already managed to gain access to your hashed passwords, 9 times out of 10 it's already game over
[+] masa331|7 years ago|reply
Please don't use passwords at all. They are wrong for so many reasons. Use emailed sign-in links.
[+] pbhjpbhj|7 years ago|reply
Which, in general, relies on your user's email password... which may not be as secure as their bank password, because "email doesn't handle money".

So, to access your bank now crackers just need to get in to your email (which may be true anyway, of course 2FA helps in both cases).

[+] MarkMc|7 years ago|reply
If you say to the user, "sorry but that password is too common - please try again" then the user will simply add a 1 to the end of the password and press submit. That doesn't offer much improvement in security.
[+] mnutt|7 years ago|reply
If that happens commonly, the original password with a 1 appended will probably eventually appear in a future HIBP database. In fact, the user could continue adding 1s until they either give up and try something different, or until it becomes uncommon enough.
[+] throwawaymath|7 years ago|reply
This is not a good idea, as implemented. You shouldn't disallow a user from using an otherwise strong password just because it's detected in a breach unless you can definitively see that it's already associated with their email address or username.

The logical conclusion of a password checking system like this is that this password:

    ZBjHWJd$8XbJhY7LQvkmARBW)p7xgiDzDw}iMLLw
can no longer be used by anyone, because I've just "breached it" by posting it on Hacker News.
[+] rmtech|7 years ago|reply
If the password actually has a lot of entropy but it appears in a breach then that's some fairly strong evidence that the user is reusing it.

Specifically if it appears n times in the HIBP database you should assign at least roughly 1/n probability that the user is reusing it.

So if you assign disutility -V to letting a user have a known username + password combo and utility U to letting a user sign up with a known password but unknown username, the expected utility is (n-1)/n × U - 1/n × V

Reasonable values of U and V for a given site will be different depending on the application, but for online banking -V would be maybe -20 and U might be negative as well. You wanna bank with a public password lol? For something like gmail or Facebook it would be the same story.
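
A small sketch of that expected-utility formula, to show how the answer moves with n. The value V = 20 is taken from the banking example; U = 1 is purely an assumption picked so the numbers are easy to read.

```python
def expected_utility(n: int, u: float, v: float) -> float:
    """(n-1)/n * U - 1/n * V, where n is the number of times the password
    appears in the breach data and 1/n is the assumed reuse probability."""
    if n < 1:
        raise ValueError("n must be at least 1 for a password found in a breach")
    p_reuse = 1 / n
    return (1 - p_reuse) * u - p_reuse * v


U, V = 1.0, 20.0  # U is an illustrative assumption; V follows the banking example

for n in (1, 5, 21, 50):
    print(n, round(expected_utility(n, U, V), 2))
```

With these illustrative values the expected utility only turns positive once n exceeds 1 + V/U = 21, so low-count hits are the ones worth rejecting on this model. If U is zero or negative, as suggested for banking, the expected utility is never positive.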

On the other hand if the password is quite weak then it's vulnerable to credential stuffing. If it appears, say, 10,000 times in the HIBP database then most likely it's as good as public whether or not the user account name is known.

Maybe there's a sweet spot around 50 instances where you can't really credential stuff it, and you also aren't that sure that it's a reuse.

In terms of usability you could tell the user to change it up a bit, add some words.

For example, r0bbiewilliams appears 5 times in the database. luvrobbiewilliams appears 0 times AND IS PROBABLY EASIER TO REMEMBER!

You can almost always get away from a breached password by adding a small amount of text.

[+] Buttons840|7 years ago|reply
The odds of a user randomly selecting that password, rounded to 20 decimal places, is 0.
[+] echelon|7 years ago|reply
If they use an already compromised password, they're prone to a dictionary attack.

edit: I did try logging into your account with the password you posted. :P

[+] stormbrew|7 years ago|reply
That said, I know of one event where someone got into a bunch of accounts on a site and did some real damage by using known username/password combinations. Preemptively forcing specifically those accounts to change their passwords, and blocking those passwords from being used, would have prevented what actually happened: the site invalidated literally every user's password instead.
[+] bijection|7 years ago|reply
Any known password is no longer a particularly strong one.
[+] rmtech|7 years ago|reply
Just a quick point that is worth considering: If a user types in a password and that password appears once in the HIBP breach list, then it is extremely likely that the source of the password in the breach list IS that user.

If it appears 2-3 times, then there is still a significant chance that that user is the source of the password getting into the HIBP database.

And if that user is the source, then the bad guys most likely know that user's email and password, and their account is wide open.
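
For reference, here is a hedged sketch of how the appearance count can be looked up through the Pwned Passwords range endpoint, which takes the first five hex characters of the SHA-1 hash and returns `SUFFIX:COUNT` lines for every hash sharing that prefix. The function names and the User-Agent string are made up for illustration, and the network call is kept separate from the parsing so the parsing can be exercised offline.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def fetch_range(prefix: str) -> str:
    """Download all hash suffixes sharing the 5-character prefix.
    Only the prefix leaves the machine, never the password or full hash."""
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "breach-check-sketch"},  # arbitrary placeholder
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def count_in_range(password: str, range_body: str) -> int:
    """Find this password's suffix in a range response; 0 if absent."""
    suffix = sha1_hex(password)[5:]
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def breach_count(password: str) -> int:
    """Number of times the password appears, according to the service."""
    return count_in_range(password, fetch_range(sha1_hex(password)[:5]))
```

A signup handler could then treat a count of 1 to 3 as "probably this very user" and a large count as "common password", along the lines described above.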

[+] scarhill|7 years ago|reply
Exactly. I think of the HIBP password list as having three types of passwords (this is an oversimplification, but bear with me):

1) Extremely weak ones that lots of people use (e.g. 'password1')

2) Somewhat unique ones (their pet's name and birthday)

3) Truly strong ones (random, long strings)

I don't want users on my site using type 1 passwords at all. If a password is really type 3, the odds say that no user will ever try to use it again, so there's no collateral damage in blocking it. The person signing up with a type 2 is almost certainly the same user whose credentials are in the breach. I don't want them to reuse that password on my site because it makes their account vulnerable to credential stuffing.

[+] scoot_718|7 years ago|reply
So they'd have to be testing in the browser. Seems unlikely.
[+] CamperBob2|7 years ago|reply
That is going to be insanely annoying.
[+] scarhill|7 years ago|reply
Could you explain? Unless you're using a really weak password or reusing passwords, how would it affect you?