jas- | 1 year ago

Great post! Thanks for taking the time to put this up.

How common do you think improper nonce use is with this mode?

Most implementations that I am familiar with intentionally generate a random nonce to help lower the percentage of app devs doing this very thing.

whs|1 year ago

My company needs deterministic encryption to search encrypted data.

It turns out the people who wrote the in-house Go library didn't really know how to build it. There is no non-deterministic encryption function, because choosing between the two correctly might be too complicated for non-senior engineers (who, after all, wrote most of the actual application).

The first version uses AES-CFB. There's no authentication. It was probably copy-pasted from a public Gist, and nobody ever commented that it is insecure. I wonder if it was actually intended to be the non-deterministic version, but the higher-level wrappers do not wrap this function, so people didn't actually use it.

The second version uses AES-GCM with a nonce derived from the key and AD. Since nobody understood why AD is needed, AD is always nil. Essentially there's only ever one nonce per key.

I think the problem is that many senior engineers know that encryption uses the "AES" library, but the Go standard library doesn't tell you how to use it securely.

Surprisingly, this mistake also happened in our Java stack, which was written by a different team. A senior engineer did notice and quietly moved away from the vulnerable version without telling the Go team.

I wrote a POC to decrypt data from the Go version, then wrote the third version; perhaps it will be open source soon. The new library only implements envelope key management, an encrypted string wrapper, and ORM integration. The rest is Google's Tink.

bawolff|1 year ago

> My company needs deterministic encryption to search encrypted data.

I'll take things you should never do as a non-expert for $100.

> The first version uses AES-CFB. There's no authentication. It was probably copy-pasted from a public Gist, and nobody ever commented that it is insecure. I wonder if it was actually intended to be the non-deterministic version, but the higher-level wrappers do not wrap this function, so people didn't actually use it.

Lack of authentication is probably the least of your concerns if your product is searching over encrypted data.

kbolino|1 year ago

You use the AD to authenticate additional information that doesn't need to be encrypted. For example, if you separately encrypted every record of a database, you could leave a non-sensitive identifier exposed along with each of them and validate it as the AD when decrypting. This would allow you to find specific records quickly assuming you also had an (encrypted) index or some prior knowledge. As with any case of leaving some data exposed, this can open up certain avenues of attack depending on the threat model. If the data can be tampered with, for example, this isn't a good idea since an attacker can corrupt your database (you'll know, but it will be unusable).
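A small sketch of the record-identifier idea, assuming AES-GCM from Go's standard library. The helpers `sealRecord` and `openRecord` and the `user:1` style identifiers are made up for illustration; the point is only that a ciphertext moved to a different record fails authentication.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// sealRecord (hypothetical helper) encrypts value and binds it to recordID by
// passing the ID as additional data. The ID itself is stored in the clear.
func sealRecord(key []byte, recordID string, value []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, value, []byte(recordID)), nil
}

// openRecord (hypothetical helper) decrypts only if recordID matches the ID
// that was supplied when the value was sealed.
func openRecord(key []byte, recordID string, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, fmt.Errorf("sealed record shorter than nonce")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], []byte(recordID))
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := sealRecord(key, "user:1", []byte("secret value"))
	if err != nil {
		panic(err)
	}
	pt, err := openRecord(key, "user:1", sealed)
	fmt.Println(string(pt), err)

	// Copying the ciphertext under another record's ID is detected.
	_, err = openRecord(key, "user:2", sealed)
	fmt.Println("moved to another record:", err)
}
```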

[Edit: I was unaware of the existence of "deterministic AEAD" before I wrote this: "Deterministic" encryption is discouraged because it passes through block-aligned patterns in the plaintext to the ciphertext. There is a simple method to do what you're after: it's just feeding your data (with padding) directly into the cipher (so-called ECB mode). Go's standard library gives you the raw AES cipher to do this with, but it doesn't expose the standard padding mechanisms (and it's not authenticated). You should be aware that doing anything like this leaves your data open to certain kinds of cryptanalysis that can infer the plaintext without directly breaking the cipher.]

I largely agree that the standard library doesn't provide any solid guidance or higher-level APIs for any use case other than TLS. The implementations seem to be pretty high-quality but you quickly go from "it's hard to use this wrong" in some libraries to "here's a drawer full of sharp knives" in others.

tadfisher|1 year ago

Correct. GCM is an improvement over ECB and CBC; it doesn't magically transform a symmetric algorithm into an asymmetric one. So most libraries are going to focus on the use cases where symmetric crypto makes sense, which are single-party scenarios such as disk storage. Google's Tink library, for example, completely hides the nonce parameter from its API.

unscaled|1 year ago

GCM is an improvement over CBC since it has authentication, but it does have a few weaknesses that CBC does not suffer from:

1. CBC does not have the same class of vulnerability to nonce/IV reuse. Reusing an IV would leak some information about the first block (or the first few blocks, if they are the same), but it would not give you the XOR of two plaintexts or let you recover the keystream. On the other hand, CBC is vulnerable when IVs are predictable (e.g. the BEAST attack).

2. CBC with a proper encrypt-then-MAC scheme (e.g. HMAC-SHA256 + HKDF-SHA256 for generating authentication and encryption keys) can encrypt more data than GCM without rotating a key. GCM with random nonces is particularly problematic, since at some point you would run into a nonce collision.

Overall, AES-GCM is preferable to AES-CBC because it is quite hard to implement a good encrypt-then-MAC scheme on top of AES-CBC unless you know what you're doing. But it's not good enough as a general worry-free solution, even when you're using a library to wrap nonce generation for you. What you want is XChaCha20Poly1305, if you're going for a ubiquitous and mature cipher.

denimnerd42|1 year ago

The problem with a random nonce is that most implementations also use a 12-byte nonce, which in some use cases might not be enough to avoid repeating a nonce. The suggested remedy is to use a counter instead, but that can be hard to implement.

When I use AES-GCM I just use a bigger nonce and use a random one.

Last time I used AES-GCM I had a really hard time getting the person writing the other end to not re-use nonces.

imurray|1 year ago

> When I use AES-GCM I just use a bigger nonce and use a random one.

I don't think nonces bigger than 12 bytes will help. My quick reading of the AES-GCM spec is that when using a nonce that's not 96 bits (12 bytes), it is hashed (with GHASH) down to a single 128-bit counter block rather than used directly. So either the nonce (called the IV in the spec) is carefully constructed from a counter and set to exactly 96 bits, or the number of invocations is limited. The spec still restricts use of a key to 2^32 total uses for random nonces of any bigger length (resulting in a re-use probability of about 1e-10):

https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpubli...
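The 1e-10 figure can be checked with the usual birthday approximation, p ≈ n^2 / 2^(b+1) for n random b-bit nonces, which holds while p is small. The function name `collisionProbability` is made up for this sketch, and the 192-bit case is included only as a comparison with the nonce size used by XChaCha20.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance of at least one repeated nonce
// among 2^log2Messages uniformly random nonces of nonceBits bits, using the
// birthday approximation p ~ n^2 / 2^(bits+1). Only valid when p is small.
func collisionProbability(log2Messages, nonceBits float64) float64 {
	return math.Exp2(2*log2Messages - nonceBits - 1)
}

func main() {
	// 2^32 messages with random 96-bit nonces: 2^-33.
	fmt.Printf("96-bit nonce:  %.2e\n", collisionProbability(32, 96)) // 1.16e-10

	// Same message count with a 192-bit nonce, for comparison.
	fmt.Printf("192-bit nonce: %.2e\n", collisionProbability(32, 192))
}
```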

gnabgib|1 year ago

I think the writer is @frereit; they submitted it 2 days ago: https://news.ycombinator.com/item?id=40623885

frereit|1 year ago

Yes, I am, but unfortunately I do not think I can provide any answers here. A quick internet search reveals some CVEs for nonce reuse.

If I had to guess, based on absolutely nothing but a gut feeling, I'd say this may appear more frequently in IoT devices, where AES-GCM is attractive because of its speed, but randomness is sometimes in low supply?