Bcrypt is a password hash, not a KDF, but a KDF is how it was used in this API. It's super unclear to me why they wanted a string-based KDF here at all; does anyone have more context?
I've in the past been annoying about saying I think we should just call all password hashes "KDFs", but here's a really good illustration of why I was definitely wrong about that. A KDF is a generally-useful bit of cryptography joinery; a password hash has exactly one job.
They didn't want a KDF, as far as I know, but they wanted a hash function with unlimited input size.
Including the username in the hash input gives you guaranteed domain separation between users that you don't get from salts/nonces. It's a generally good idea if you have a hash function with unlimited input size (all modern cryptographic hash functions except bcrypt have unlimited input size).
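To illustrate the domain-separation point, here's a minimal sketch using SHA-256 as a stand-in for any unlimited-input hash (the length-prefixed encoding and function name are illustrative, not from any particular library):

```python
import hashlib

def domain_separated_hash(username: str, password: str) -> str:
    """Bind the digest to one specific user by hashing the username in with
    the password, length-prefixed so ("ab", "c") and ("a", "bc") can't
    produce the same input bytes."""
    u = username.encode()
    data = len(u).to_bytes(4, "big") + u + password.encode()
    return hashlib.sha256(data).hexdigest()

# Same password, different users: different digests by construction,
# without relying on a random salt for the separation.
assert domain_separated_hash("alice", "hunter2") != domain_separated_hash("bob", "hunter2")
# Ambiguous splits can't collide thanks to the length prefix.
assert domain_separated_hash("ab", "c") != domain_separated_hash("a", "bc")
```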
So let me take on the burden of stupid here: how are a password hash and a string-based KDF different? (I mean, the oldest well-known example of the former literally calls itself a PBKDF.) I understand this particular function from strings to large fixed numbers was limited in the length of the string it would accept, and I agree that’s a problem, but it feels like a problem orthogonal to the distinction you’re drawing.
The value is the combination of userid, username, and password, so in threads on other platforms people have hypothesised that the developer tried to play it safe and use a password hash because of the password's presence.
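To see why that combination is dangerous under truncation, here's a toy stand-in for bcrypt's behavior (SHA-256 over only the first 72 bytes; not real bcrypt, and the IDs below are made up):

```python
import hashlib

def toy_truncating_hash(data: bytes) -> str:
    # Stand-in for bcrypt's silent truncation: only the first 72 bytes count.
    return hashlib.sha256(data[:72]).hexdigest()

user_id = "00u1abcd2efghIJKL345"  # hypothetical 20-char Okta-style ID
username = "firstname.lastname@very-long-corporate-domain.example"  # 53 chars

real = (user_id + username + "correct horse battery staple").encode()
forged = (user_id + username + "anything at all").encode()

# userId + username alone already exceeds 72 bytes...
assert len((user_id + username).encode()) > 72
# ...so the password never reaches the hash, and any password "matches".
assert toy_truncating_hash(real) == toy_truncating_hash(forged)
```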
Also I'm not sure the average developer understands the distinction.
Unfortunately the industry defined Bcrypt as a KDF for some time, even though it is better named as a "password hashing function". Cryptography has a history of being bad at picking good names for new work.
In addition to (true) KDFs, people often want an HKDF (HMAC-based Key Derivation Function) or a hierarchical deterministic key derivation function (HDK).
I believe there have been earlier protocols where the user’s secrets were used as a KDF to generate credentials in such a way that the server never sees the user’s password.
I feel gross calling a function that just blatantly ignores part of its input a hash, much less a password hash. It's like calling a rock a fish, because they're both in water, despite the lack of swimming. In any case, a hash that ignores some of its input is certainly not a cryptographically secure hash, so why is it being used for crypto?
I can see the incident was a jumping-off point to talk about bad APIs (bcrypt should probably error on inputs over 72 bytes), but it sounds like the actual bug was that they weren't checking that the value in the cache matched the data they used in the hash for the key. The authentication cache check should survive any arbitrarily bad hashing algorithm, because all of them are going to have collisions (pigeonhole principle). Even an arbitrarily 'strong' hash function with no input truncation, as long as it has a fixed-width result, will have this property. Thus, any arguing in the comments here about different hash functions with different truncation properties is moot.
The analogy is something like creating a hash map whose insert function computes the slot for the key and unconditionally puts the value there instead of checking if the keys are the same during a collision. No amount of tinkering with the hash function fixes this problem. The algorithm is wrong. A hashmap should survive and be correct even giving it a hash function that always returns 4.
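Python's built-in dict demonstrates this: even with a hash function that literally always returns 4, lookups stay correct, because the map compares the full keys on collision:

```python
class AlwaysFour:
    """A key type whose hash function 'always returns 4'."""
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return 4  # every key collides into the same bucket
    def __eq__(self, other):
        return isinstance(other, AlwaysFour) and self.value == other.value

m = {AlwaysFour("alice"): 1, AlwaysFour("bob"): 2}

# Despite every slot colliding, lookups remain correct (just slower),
# because the map checks key equality, not merely the hash.
assert m[AlwaysFour("alice")] == 1
assert m[AlwaysFour("bob")] == 2
assert AlwaysFour("carol") not in m
```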
Something similar happens in my company too. In a particular place, we use the hash of a string as the key in a hashmap instead of the string itself, because the hash is smaller and easier to compare after the initial map has been made. It's a 64-bit hash, too. I have been crying about this every time it comes up, and the response is "it will never happen". My problem is that if a collision ever does happen, we will never know.
Not sure about that. A hash function suitable for security-sensitive work, used properly, should make a collision so unlikely that you can basically forget that it's even possible.
Think about it, that's what hashing passwords relies on. We don't store a plaintext password for a final check if the password hash matches, we count on a collision being basically impossible.
A hashmap is different, because it's using a much weaker hash function with far fewer security guarantees.
Plus, you're assuming the original values are even kept around for comparison. The cache key likely just mapped to something simple like a boolean or status flag.
I would guess that they felt comfortable that the bcrypt output (192 bits) is enough that collisions are very unlikely. If these were already partitioned by customer, rather than being a single cache for the entire Okta userbase that seems fine. You're going to have weird cosmic ray bugs more often than a natural collision.
Now, the data structure they're using for a cache will use some sort of hash table, likely in memory, so maybe they've got the 192-bit bcrypt "key" and then that's hashed again, perhaps well or perhaps badly [e.g. C++ really likes using the identity function so hash(12345) = 12345] but then a modulo function is applied to find the key in an index and then we go groping about to look for the Key + Value pair. That part the API probably took care of, so even if the hash has size 6-bits, the full 192-bit key was checked. But the original data (userid:username:password) is not compared, only that 192-bit cache key.
I've experienced that most incidents, whether security, performance, availability, etc., rarely have one single thing that goes wrong. There's a chain of events that happen.
Poor API design can make it easier for other contributing factors (checking cache here, but could also be not running load tests, not fuzzing, human error, etc.) to cause incidents.
I'm glad to see this come out, plus the comparison of which libraries handle out-of-bounds conditions with errors vs. fixing up the input, causing silent failures.
I strongly agree with the conclusion that the libraries should reject input they can't correctly handle instead of silently truncating it.
I co-maintain a rate-limiting library that had some similar rough edges, where it wouldn't always be obvious that you were doing it wrong. (For example: limiting the IP of your reverse proxy rather than the end user, or the inverse: blindly accepting any X-Forwarded-For header, including those potentially set by a malicious user.) A couple years back, I spent some time adding in runtime checks that detect those kinds of issues and log a warning. Since then, we've had a significant reduction in the amount of not-a-bug reports and, I assume, significantly fewer users with incorrect configurations.
I enjoyed the article and the detailed analysis for different languages. The conclusion is probably the part where most of the disagreement lies. API design is not really at fault here if we consider the purpose of the API and the intended output.
The API was designed to generate a hash for a password (a knowledge factor), and for performance and practical reasons a limit was picked (72 bytes). If someone already knows the first 72 characters of your password, the probability is a lot higher that the abuser has the remaining characters too.
While the smaller mistake here, in my opinion, was not knowing the full implementation details of a library, the bigger mistake was trying to use the library to generate a hash of publicly available/visible information.
There's nothing wrong with a limit. The problem is that the library silently does the wrong thing when the limit is breached, rather than failing loudly.
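A sketch of what failing loudly could look like, as a pre-check before the actual hash call (the limits match bcrypt's documented 72-byte cutoff and NUL-terminator behavior; the function name is hypothetical):

```python
BCRYPT_MAX_BYTES = 72

def check_bcrypt_input(password: bytes) -> bytes:
    """Reject input bcrypt would silently mangle, instead of truncating."""
    if len(password) > BCRYPT_MAX_BYTES:
        raise ValueError(
            f"bcrypt input is {len(password)} bytes; anything past "
            f"{BCRYPT_MAX_BYTES} would be silently dropped"
        )
    if b"\x00" in password:
        raise ValueError("bcrypt treats NUL as a terminator; embedded NUL rejected")
    return password

# A caller would then do e.g. bcrypt.hashpw(check_bcrypt_input(pw), salt).
assert check_bcrypt_input(b"x" * 72) == b"x" * 72
try:
    check_bcrypt_input(b"x" * 73)
except ValueError:
    pass  # over-long input fails loudly, as it should
else:
    raise AssertionError("over-long input must be rejected")
```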
This is a completely unreasonable API. It reminds me of the `mysql_real_escape_string` vs. `mysql_escape_string`. The default API must be the strict one. You should be able to configure it to be broken but silent truncation is an insane piece of functionality. There is no universe in which this is logical. One might as well just have everything return void* and then put in the documentation what type to cast to. The invariant is clearly a historical accident.
As a mistake, it's fine. Everyone writes up things like that. But defending it as an affirmatively good decision is wild.
Had they done bcrypt(password + userId + username), it wouldn't have been so bad. Order of entropy is important.
Also I'm not sure what functionality the authentication cache provides, but their use of bcrypt(userId + username + password) implies the password is kept around somewhere, which is not the best practice.
OT. Has Argon2 basically overtaken Bcrypt in password hashing in recent years?
> Had they done bcrypt(password + userId + username), it wouldn't have been so bad. Order of entropy is important.
That depends on how exactly it was used. If it was simply used to check if previous authentication was successful (without the value containing information who it was successful for) then single long password could be used to authenticate as anyone.
The bcrypt implementation in the Zig standard library has both the bcrypt() function (where truncation is explicitly documented) and the bcryptWithoutTruncation() function (which is recommended and automatically pre-hashes long passwords).
It's great to hear that Zig covered both cases. However, I'd still prefer the opposite behavior: a safe (without truncation) default `bcrypt()` and the unsafe function with the explicit name `bcryptWithTruncation()`.
My opinion is based on the assumption that the majority of the users will go with the `bcrypt()` option. Having AI "helpers" might make this statistic even worse.
Do you happen to know Zig team's reasoning behind this design choice? I'm really curious.
Seems odd to keep the broken one as the default[0], make the "recommended" one so much longer, and provide none which simply errors on overlong passwords.
Also neither seems to warn about the NUL issue.
[0] I assume for compatibility purpose, but it still seems very dubious.
Password hash functions are designed to be slow, are designed to be used with salts, and may have low-entropy inputs.
Hash functions themselves are general purpose and don't protect against low entropy inputs (low entropy passwords). They also don't protect against rainbow tables (pre-calculated digests for common or popular passwords). For password hashing you want something slow and something with unique entropy for each user's password to prevent rainbow attacks.
It doesn't solve the problem of weak passwords, but it's the best that can be done with weak passwords. The only improvement is to enforce strong passwords.
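A minimal sketch of that per-user entropy, using stdlib PBKDF2 (the parameters are illustrative, not a production recommendation):

```python
import hashlib
import os

def hash_password(password, salt=None):
    """PBKDF2 with a per-user random salt: the salt is the 'unique entropy'
    that makes a precomputed (rainbow) table of common passwords useless."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

s1, d1 = hash_password("hunter2")
s2, d2 = hash_password("hunter2")

# Two users with the identical password get different stored hashes.
assert d1 != d2
# Verification re-runs the KDF with the stored salt.
assert hash_password("hunter2", s1)[1] == d1
```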
In Node, you would commonly reach for the builtin core "node:crypto" module to run cryptographic functionality like this. I wondered why that wasn't used here, but bcryptjs was. After digging into it a little, node doesn't ship with core support for bcrypt, because it's not supported by OpenSSL.
The node crypto module is essentially an API that offloads crypto work to OpenSSL, and OpenSSL doesn't support bcrypt, for reasons to do with standardisation. https://github.com/openssl/openssl/issues/5323
Since bcrypt is not a "standardised" algorithm, it makes me wonder why Okta used it at all.
I remember studying cryptography for application development at uni, and even then, back in 2013, bcrypt was used and recommended but not standardised. It says a lot that 12 years on it still hasn't been.
Reminds me of when I saw a junior developer calling SHA-1 on an incrementing integer ID, with no salt. We had a long talk about it; he thought the output was too "scrambled" for anyone to recognize what was being done. And he shouldn't have been so junior: he was 4 or 5 years into his career. I had to be the bad guy and override his decision without further discussing why it was a bad idea, even though I really tried for a good 45 minutes to explain. He got it a week later when I showed him rainbow tables. I felt bad having to tell him to just do what I said, but sometimes you have to make the decision to say "do what I said; I'm sorry you don't understand, I tried to explain."
I've seen this before, a belief that just because the output looks random that it is secure. It's like storing license plates -- just hashing them without additional seasoning is of little use, because the number of possible license plates is so low that they can easily be brute forced.
Similarly, a developer I worked with once claimed that CRC32 was sufficient verification because CRC32s changed so drastically depending on the data that they were difficult to forge. He was surprised to find out not only is it trivial to update a CRC32, but also to determine the CRC polynomial itself from very few samples.
Rainbow tables is not the (only) reason you dont want to hash something low entropy like an incrementing int, and adding a salt wouldn't make this secure.
[Im assuming the usual definition of salt where it is known by the attacker... a pepper would be fine]
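A quick sketch of why a known salt doesn't rescue a low-entropy input like an incrementing ID: the attacker simply enumerates the whole input space, salt included (values here are made up):

```python
import hashlib

# A "secret" incrementing ID hashed with a salt the attacker knows.
salt = b"public-per-record-salt"
leaked = hashlib.sha256(salt + b"1337").hexdigest()

# The input space is so small that brute force over every candidate ID
# recovers the original in milliseconds; the salt only blocks
# precomputation, not enumeration.
recovered = next(
    i for i in range(10_000)
    if hashlib.sha256(salt + str(i).encode()).hexdigest() == leaked
)
assert recovered == 1337
```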
That is such a rookie mistake. It's not some hidden piece of information that bcrypt has a 72-byte limit; it's pretty widely documented across implementations and languages.
How does a company whose only job is security screw that up so badly?
On one hand, I have never heard of a password hashing algorithm that truncated to 72 characters. I assumed all one-way hashing functions were arbitrary length input. "arbitrary input -> fixed output" has always been part of the definition of hash, to me.
On the other hand, I'm not a security developer at Okta.
On the other hand, why not have implementations raise an error if they are given a string longer than 72 bytes? It feels to me like no one would ever truncate on purpose, so this is a mistake that's easy to make accidentally with a really important function.
I was curious how bcrypt-ruby would handle this. It does not throw an error for input longer than 72 bytes. However, the actual API for the gem makes it pretty clear it's for hashing a password, and not just hashing in general, as you can see from the code.
Hold on, in the Rust example, how does `err_on_truncation` get set? TFA completely ignored that there's a setting somewhere (probably incorrectly defaulting to false)
the rust library exposes a handful of "non_truncating_*" functions that enable error handling. i would expect this to be for drop-in compatibility with old code.
amusingly, the python "library" is just a thin wrapper around the same rust library.
protip: a lot of cryptography primitives actually aren't that complicated in terms of the code itself (and often can be quite elegant, compact and pleasing to the eye). if it's important, probably worth just reading it.
it's what people wrap them with or the systems they build that get messy!
I'm confused, it seems that the OP wants to use Bcrypt as an encoding/decoding utility.
About solutions, Django hashes by default the password only with a salt. I'm not sure why it would be valuable to combine user_id+username+password. I've always assumed that using salt+password was the best practice.
Regarding the API design, I agree now with OP after reading other comments on HN. The API would be improved if it clearly indicates to the user when truncation is done, even if this understanding is implied by principle.
I've seen this multiple times. Even better, I don't know how many ways we found a simple workaround or bypass of the complete process in so many apps... In essence this has nothing to do with the API itself; the way in which it's used is another ballgame altogether. Great post though.
Can someone explain, in clear layman terms, what the difference is between a password hash and a KDF? I have gone through this whole thread and tried to look around online, but I still don't understand.
Password hash is designed for matching: take the salt, add it to the password, run it through the hash, compare it to the stored hash. The important properties are:
- MUST be non-reversible, including against tricks like "rainbow tables"
- should be somewhat expensive to discourage just trying all possible passwords against a (leaked) hash
KDF is a key derivation function. The value will be used as a key in, say, AES. The important properties are:
- should distribute entropy as well as possible, across the required width of output bits
- irreversibility is less important, as the derived key shouldn't be stored anywhere
- may or may not want artificially inflated cost to discourage cracking
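The KDF usage described above can be sketched with stdlib PBKDF2, deriving a 256-bit key of exactly the width a cipher like AES-256 expects (iteration count is illustrative):

```python
import hashlib
import os

passphrase = b"correct horse battery staple"
salt = os.urandom(16)

# Derive exactly the key width the cipher needs: 32 bytes for AES-256.
key = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000, dklen=32)
assert len(key) == 32

# The derived key itself is never stored; keeping the salt and iteration
# count is enough to re-derive the identical key on demand.
rederived = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000, dklen=32)
assert rederived == key
```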
A password hash is a simple hash of a password; plain hash algorithms are made to be fast. KDFs - key derivation functions - are slow by design and are made to derive a key from a given string. They are designed to be slow to make password searching slower. This is a 2c tour of the topic.
Another incident at Okta? Oh no! Its security has _always_ been a mess. It's a dumpster fire, and no client of theirs cares, because their own identity systems are so messed up that it's better to have the mess of Okta than the mess they are sitting on. It's kinda crazy they get away with such incredibly bad security practices. Like... this bcrypt issue has been known for a LONG while. We used to test for it 8-10 years ago.
There's either (1) nobody competent enough there to know (which is likely not true, I had a pentester friend recently join, and she is very good), or, more likely (2) management doesn't care and/or doesn't give enough authority to IT security personnel.
As long as clients don't have any better options, Okta will stay this way.
> no client of their cares because their identity systems are so messed up, that it's better to have the mess of Okta, than the mess they are sitting on
Yes, this is very true.
Also, some companies realize that they can screw up royally because they do not have the proper knowledge, and authentication is not a core business of theirs.
I can understand them. I also use mail systems I am not that happy with, but I have this comforting idea that if they have a problem, 3B people are waiting together with me for it to be solved, and that's the kind of pressure that helps.
PHP's password_* functions make it difficult to misuse in this particular way. There's no function in that API which hashes a password with a controllable salt and returns the result; there's only password_hash(), which always uses a random salt, and password_verify(), which rehashes a password internally and returns a bool for whether it matched.
(It's still got the truncates-at-72 problem with PASSWORD_BCRYPT, though.)
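The same misuse-resistant shape can be sketched in Python (PBKDF2 standing in for bcrypt; the point is the API shape, with no caller-controllable salt and a bool-returning verifier, not the exact algorithm):

```python
import hashlib
import hmac
import os

def password_hash(password: str) -> str:
    """Like PHP's password_hash(): the salt is always random and internal;
    callers never get to supply one."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt.hex() + "$" + digest.hex()

def password_verify(password: str, stored: str) -> bool:
    """Like PHP's password_verify(): re-hash internally with the stored
    salt and return a bool, using a constant-time comparison."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), 100_000
    )
    return hmac.compare_digest(digest.hex(), digest_hex)

stored = password_hash("hunter2")
assert password_verify("hunter2", stored)
assert not password_verify("wrong", stored)
```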
Simply hashing your data (using an established hashing algorithm/library combo) to later compare two hashes in order to check whether the data has changed doesn’t usually feel like rolling your own crypto.
I would bet that if you surveyed working programmers 9 out of 10 would say that they thought bcrypt() was a “KDF designed by professionals”. The treacherous API is not as well known as it should be.
lelanthran|1 year ago
I remember :-): https://news.ycombinator.com/item?id=42899432
sandeepkd|1 year ago
Product - we need to provide service availability even if the AD is down
Engineer - OK, maybe we can store the ~credentials in a cache
Security - oh, in that case make sure everything in the cache is hashed properly with the recommended Bcrypt algorithm
Engineer - We got approval from security, we're in a much safer zone, let's deliver and get a win
hinkley|1 year ago
I’m wondering if okta was inspired by those.
veqq|1 year ago
```
(defmethod derive-key ((kdf bcrypt) passphrase salt iteration-count key-length)
  (declare (type (simple-array (unsigned-byte 8) (*)) passphrase salt))
  (unless (<= (length passphrase) 72)
    (error 'ironclad-error
           :format-control "PASSPHRASE must be at most 72 bytes long."))
  ...)
```
IshKebab|1 year ago
If you don't see the mistake, Google 'yaml.safe_load'.
coolgoose|1 year ago
Is there a reason I might be missing?
n0rdy|1 year ago
> SHA-2 family of hashes was designed to be fast. BCrypt was designed to be slow.
Slow == harder to brute-force == more secure.
eadmund|1 year ago
One of the points of the article is that documentation isn’t enough: one cannot allow callers to misuse one’s API.
n0rdy|1 year ago
While I don't have any answers to this, I've realized that it's an ideal showcase of why fuzz testing is useful.
jmuguy|1 year ago
https://gist.github.com/neontuna/dffd0452d09a0861106c0a46669...
kmarc|1 year ago
Funnily, TFA later also suggests that such function should exist...
throwaway-9111|1 year ago
Are there any security issues with that? I'm a "newb" in this area, so I'm genuinely curious about the flaws in the naive approach.
bawolff|1 year ago
Weird take. Usernames are often chosen by the user. Less so in the corporate world, but definitely not unheard of.
nabla9|1 year ago
Don't concoct your own cryptographic hacks. Use an existing KDF designed by professionals.