Don't pass on small block ciphers

cyberax|13 days ago

Small block ciphers are great for some use-cases!

32-bit block ciphers are a good way to create short opaque IDs because they provide a bijection between two sets of integers. And even if your ID is slightly shorter than 32 bits, you can easily shave off a few bits with cycle walking: https://en.wikipedia.org/wiki/Format-preserving_encryption#F... This helps, e.g., if you want to make sure your IDs can be mapped into 31/63 bits.
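A minimal sketch of cycle walking, to make the idea concrete. The 32-bit permutation here (`toy_perm32`) is a hypothetical stand-in: a 4-round Feistel network built on keyed BLAKE2s, chosen only because it fits in Python's standard library. It is not a vetted cipher, and the function names are made up for illustration.

```python
import hashlib


def _round_fn(i: int, half: int, key: bytes) -> int:
    # Keyed 16-bit round function (illustrative choice, not a standard).
    data = bytes([i]) + half.to_bytes(2, "big")
    digest = hashlib.blake2s(data, key=key, digest_size=2).digest()
    return int.from_bytes(digest, "big")


def toy_perm32(x: int, key: bytes, decrypt: bool = False) -> int:
    """Toy 4-round balanced Feistel permutation on 32-bit integers."""
    left, right = x >> 16, x & 0xFFFF
    if not decrypt:
        for i in range(4):
            left, right = right, left ^ _round_fn(i, right, key)
    else:
        for i in reversed(range(4)):
            left, right = right ^ _round_fn(i, left, key), left
    return (left << 16) | right


def encrypt_id(x: int, key: bytes, bits: int = 31) -> int:
    """Cycle walking: re-encrypt until the result lands inside [0, 2**bits)."""
    limit = 1 << bits
    if not 0 <= x < limit:
        raise ValueError("ID outside the target domain")
    y = toy_perm32(x, key)
    while y >= limit:
        y = toy_perm32(y, key)
    return y


def decrypt_id(y: int, key: bytes, bits: int = 31) -> int:
    """Inverse walk: apply the inverse permutation until back in the domain."""
    limit = 1 << bits
    if not 0 <= y < limit:
        raise ValueError("ID outside the target domain")
    x = toy_perm32(y, key, decrypt=True)
    while x >= limit:
        x = toy_perm32(x, key, decrypt=True)
    return x
```

For a 31-bit domain inside a 32-bit cipher, each call needs about two cipher invocations on average, since half of all outputs fall outside the domain. The cost doubles for every extra bit you shave off.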

I especially like the RC-5 cipher for these kinds of uses. It can be implemented in just a few lines of code and there are standard test vectors for it.
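To show how small it is, here is a sketch of RC5 with 16-bit words, which gives a 32-bit block. The parameter choice (w=16, 12 rounds, 16-byte key) and the way the two words are packed into one integer are my assumptions, and I have not checked this sketch against the published test vectors, so do that before relying on it.

```python
W = 16                        # word size in bits; block is 2 * W = 32 bits
MASK = 0xFFFF
ROUNDS = 12                   # assumed round count for this sketch
P16, Q16 = 0xB7E1, 0x9E37     # RC5 magic constants for 16-bit words


def rotl(x: int, n: int) -> int:
    n %= W
    return ((x << n) | (x >> (W - n))) & MASK


def rotr(x: int, n: int) -> int:
    n %= W
    return ((x >> n) | (x << (W - n))) & MASK


def expand_key(key: bytes) -> list:
    """RC5 key schedule: returns the 2 * (ROUNDS + 1) subkey words."""
    u = W // 8
    c = max(1, -(-len(key) // u))          # number of key words, rounded up
    L = [0] * c
    for i in range(len(key) - 1, -1, -1):  # load key bytes little-endian
        L[i // u] = ((L[i // u] << 8) + key[i]) & MASK
    t = 2 * (ROUNDS + 1)
    S = [(P16 + i * Q16) & MASK for i in range(t)]
    A = B = i = j = 0
    for _ in range(3 * max(t, c)):         # mix the key into the subkeys
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % t
        j = (j + 1) % c
    return S


def encrypt32(block: int, S: list) -> int:
    A, B = block & MASK, block >> 16       # low word is A (packing is my choice)
    A = (A + S[0]) & MASK
    B = (B + S[1]) & MASK
    for i in range(1, ROUNDS + 1):
        A = (rotl(A ^ B, B) + S[2 * i]) & MASK
        B = (rotl(B ^ A, A) + S[2 * i + 1]) & MASK
    return (B << 16) | A


def decrypt32(block: int, S: list) -> int:
    A, B = block & MASK, block >> 16
    for i in range(ROUNDS, 0, -1):
        B = rotr((B - S[2 * i + 1]) & MASK, A) ^ A
        A = rotr((A - S[2 * i]) & MASK, B) ^ B
    B = (B - S[1]) & MASK
    A = (A - S[0]) & MASK
    return (B << 16) | A
```

Usage would be something like `S = expand_key(secret)` once, then `encrypt32(row_id, S)` per ID.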

adrian_b|13 days ago

The RC-5 cipher was very nice for its day, but I am certain that it is much slower than AES on any modern CPU, with the exception of microcontrollers, where nonetheless other solutions, e.g. ChaCha20, may be faster.

AES also needs only a handful of lines of code for its implementation (using assembly). For such an application, you can even reduce the number of rounds of AES-128, e.g. from 10 to 4.

When you want truly uniform random numbers, then encrypting with AES-128, then truncating, is best. If you want invertible encryption, then you should encrypt a counter and either use a 32-bit addition or a 32-bit XOR for encrypting the 32-bit number. With a single AES-128 invocation for generating a random mask, you can encrypt four 32-bit numbers.
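A rough sketch of the counter-plus-XOR variant described above. Python's standard library has no AES, so keyed BLAKE2s stands in for the AES-128 call here; that substitution and the function names are assumptions for illustration only. One 128-bit mask block covers four 32-bit values.

```python
import hashlib
import struct


def mask_block(key: bytes, counter: int) -> tuple:
    """Four 32-bit mask words derived from a counter.

    Stand-in for AES-128(key, counter): keyed BLAKE2s truncated to 16 bytes.
    """
    block = hashlib.blake2s(
        counter.to_bytes(16, "big"), key=key, digest_size=16
    ).digest()
    return struct.unpack(">4I", block)


def xor_crypt4(values, key: bytes, counter: int) -> list:
    """Encrypt or decrypt four 32-bit numbers (XOR is its own inverse)."""
    if len(values) != 4:
        raise ValueError("expected exactly four 32-bit values")
    if any(not 0 <= v <= 0xFFFFFFFF for v in values):
        raise ValueError("values must fit in 32 bits")
    masks = mask_block(key, counter)
    return [v ^ m for v, m in zip(values, masks)]
```

Two caveats with this construction: the counter must never repeat under the same key, because a reused mask leaks the XOR of the plaintexts, and the counter has to be stored or derivable to decrypt. Unlike a block cipher, the same number encrypts differently under different counters.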

Of course, when speed does not matter, you can use pretty much any of the historical block ciphers, because the security requirements for encrypting 32-bit numbers are very low, since they are easier to find by brute force searching than by attempting to break any kind of encryption.

jcalvinowens|13 days ago

Funny your example is rc5, I wrote exactly what you describe to generate 32-bit cookies in a random prototype a few years ago: https://github.com/jcalvinowens/sdvr/blob/main/rc5.c

It is cute, but surely there's a more efficient way than RC5? There are bijective hash functions which are much cheaper (murmur, at least).
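For reference, a sketch of the MurmurHash3 32-bit finalizer (fmix32) together with its inverse, which shows why it is a bijection on 32-bit integers. The inverse is derived here rather than taken from any library. Note that fmix32 is unkeyed, so anyone who knows the function can invert it; it scrambles IDs but does not hide them.

```python
M32 = 0xFFFFFFFF
C1, C2 = 0x85EBCA6B, 0xC2B2AE35          # MurmurHash3 fmix32 multipliers
# Both multipliers are odd, so they have inverses modulo 2**32 (Python 3.8+).
C1_INV = pow(C1, -1, 1 << 32)
C2_INV = pow(C2, -1, 1 << 32)


def fmix32(h: int) -> int:
    """MurmurHash3 32-bit finalizer; expects 0 <= h < 2**32."""
    h ^= h >> 16
    h = (h * C1) & M32
    h ^= h >> 13
    h = (h * C2) & M32
    h ^= h >> 16
    return h


def fmix32_inv(h: int) -> int:
    """Undo fmix32 step by step, in reverse order."""
    h ^= h >> 16                      # x ^= x >> 16 is its own inverse
    h = (h * C2_INV) & M32
    h ^= (h >> 13) ^ (h >> 26)        # inverse of x ^= x >> 13
    h = (h * C1_INV) & M32
    h ^= h >> 16
    return h
```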

mananaysiempre|13 days ago

What symmetric cryptography is there that would be reasonable on a small 8-bitter? This means

- As little code as possible;

- As little constant data as possible;

- Little to no shifts by amounts not divisible by 8, as there may not be a barrel shifter even for bytes;

- No shifts by variable amounts, including as a space-saving technique, for the same reason;

- No multiplies beyond 16×16 bits, and preferably none at all, as there may not be a multiplier.

Speck, mentioned in TFA, fits this very well. None of the things that came out of eSTREAM or the NIST lightweight cryptography competition even qualify, as far as I can tell, as the “lightweight” part is very keen on things that are easy in hardware but hard (slow, space-hungry, or both) in software. Gimli exists but is kind of chonky. So is Speck truly it? Is just noöne interested in the problem?
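To make the fit concrete, here is a sketch of Speck32/64 (16-bit words, 22 rounds) as I understand the specification. The word ordering of the key tuple follows my reading of the paper, and the sketch is not checked here against the published test vectors, so validate it before use. The only operations are 16-bit add, XOR, and rotations by 7 and 2; on an 8-bit core, a right rotation by 7 is a byte swap followed by a one-bit left rotation.

```python
MASK = 0xFFFF
ROUNDS = 22


def ror(x: int, r: int) -> int:
    return ((x >> r) | (x << (16 - r))) & MASK


def rol(x: int, r: int) -> int:
    return ((x << r) | (x >> (16 - r))) & MASK


def expand_key(key_words) -> list:
    """key_words = (l2, l1, l0, k0): four 16-bit words, k0 is the first round key."""
    l2, l1, l0, k0 = key_words
    l = [l0, l1, l2]
    k = [k0]
    for i in range(ROUNDS - 1):
        # The key schedule reuses the round function with the counter i as key.
        l.append(((k[i] + ror(l[i], 7)) & MASK) ^ i)
        k.append(rol(k[i], 2) ^ l[i + 3])
    return k


def encrypt(x: int, y: int, round_keys) -> tuple:
    for k in round_keys:
        x = ((ror(x, 7) + y) & MASK) ^ k
        y = rol(y, 2) ^ x
    return x, y


def decrypt(x: int, y: int, round_keys) -> tuple:
    for k in reversed(round_keys):
        y = ror(y ^ x, 2)
        x = rol(((x ^ k) - y) & MASK, 7)
    return x, y
```

The expanded key is 22 words (44 bytes); on a very RAM-constrained part the round keys could instead be computed on the fly, at the cost of running the schedule for every block.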

adrian_b|13 days ago

ChaCha20 satisfies your conditions.

The only disadvantage of ChaCha20 vs. Speck is a bigger state, you need 128 bytes for it (64 bytes of state + 64 bytes for the intermediate computations), but that is not likely to be a problem, except in the smallest microcontrollers.

The bigger state of ChaCha20 is determined by higher security requirements. The advantage of ChaCha20 is that it is supported by standard protocols, e.g. TLS 1.3 and SSH.

The standard protocols mentioned above include ChaCha20 precisely for the case of communication with smaller or older CPUs, which do not have hardware AES support.

AlotOfReading|13 days ago

I agree with the article, but I think it could go farther. Instead of having primitives for every 32/48/64/122 bit block, we need good format-preserving encryption. Then all of this advice boils down to "use as many bits as you need" and we can keep using the standard primitives with hardware support. If you need more security in the future, you only need to decrypt and reencrypt with the new size.

Dylan16807|13 days ago

Small sizes have to be used with extra care, so I wouldn't want to make a generic function for all sizes. For bigger sizes we already have nice functions that take care of everything.

bflesch|13 days ago

Are you suggesting a very large custom blocksize? I don't think this would be feasible beyond a few megabytes.

tptacek|12 days ago

Much of the lightweight cipher justification in this post seems to overlap with Format Preserving Cryptography, which uses (generally) more conventional symmetric primitives (16-byte-block ciphers, for instance) to handle encryption with small domains:

https://eprint.iacr.org/2009/251.pdf

throw0101c|12 days ago

Any reason not to use Ascon, which not only got Official Status™ from NIST:

* https://www.nist.gov/news-events/news/2023/02/nist-selects-l...

* https://csrc.nist.gov/pubs/sp/800/232/final

But was also a lightweight finalist in CAESAR (along with ACORN):

* https://en.wikipedia.org/wiki/CAESAR_Competition

* https://en.wikipedia.org/wiki/Ascon_(cipher)

201984|12 days ago

Ascon is a stream-oriented AEAD, not a block cipher, and it requires a nonce. Because of this, it would not work for the use cases in TFA, not to mention it's also quite a bit slower than Speck.

PunchyHamster|13 days ago

Nowadays even many small microcontrollers get AES acceleration so I don't see much reason

chowells|13 days ago

Basically all of the use cases in the article don't make sense with AES. That's not because it's AES. That's because its blocks are significantly larger than the data you want to protect. That's the point the article was making: in very specific circumstances, there is practical value in having the cipher output be small.

avidiax|13 days ago

If you want to encrypt a serial number, you don't want the output to be 256 bits.

kpcyrd|12 days ago

The ch32v003 implements RISC-V without the M extension, meaning there's not even a MUL/DIV instruction.

Out of all the microcontrollers I've worked with, only a single one had AES CPU instructions.

whizzter|13 days ago

Not a cryptographer, but I'm not liking the "advice" of encrypting the first 64 bits of the UUID.

A user of an open-source application using this known "encryption" will be able to approximate the real UUID values based on the creation time of objects they control, and then would probably be able to approximate keys for 64-bit encryption (although I guess one could design a cipher with a far larger key than block size, but it'd be a NIH design with all the usual pitfalls).

But looking at it sanely, UUIDv7 isn't perfect, and there's really no reason not to "encrypt" the entire UUID with AES (often built into hardware anyhow) instead of just the first part.

FiloSottile|12 days ago

All of these small block ciphers have regularly large keys.

doomrobo|13 days ago

>Small block ciphers are thus generally a bad idea against active adversaries.

>However, they can be very useful against passive adversaries whose capability is limited to observing identifiers, who are then unable to map them to the original value.

Really? Isn’t the Sweet32[0] attack mostly passive? “We show that a network attacker who can monitor a long-lived Triple-DES HTTPS connection between a web browser and a website can recover secure HTTP cookies by capturing around 785 GB of traffic.”

[0] https://sweet32.info/
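A back-of-the-envelope sketch of the birthday arithmetic behind this. It assumes "785 GB" means 785e9 bytes and uses the standard n(n-1)/2 / 2^b approximation for expected collisions among n random b-bit blocks; the function names are made up for illustration.

```python
import math


def birthday_blocks(block_bits: int) -> float:
    """Roughly how many blocks until a ciphertext block collision is likely."""
    return 2 ** (block_bits / 2)


def expected_collisions(n_blocks: int, block_bits: int) -> float:
    """Expected number of colliding pairs among n uniformly random blocks."""
    return n_blocks * (n_blocks - 1) / 2 / 2 ** block_bits


def blocks_in(n_bytes: int, block_bits: int) -> int:
    """Number of whole cipher blocks in n_bytes of traffic."""
    return n_bytes // (block_bits // 8)


if __name__ == "__main__":
    n = blocks_in(785 * 10**9, 64)  # traffic captured in the Sweet32 quote
    print("log2(blocks captured):", math.log2(n))
    print("expected 64-bit block collisions:", expected_collisions(n, 64))
    print("birthday bound, 32-bit block:", birthday_blocks(32), "blocks")
```

For a 64-bit block the bound is 2^32 blocks of 8 bytes, i.e. 32 GiB of traffic under one key. For a 32-bit block it is 2^16 blocks of 4 bytes, i.e. 256 KiB, which is why small blocks only make sense when each key encrypts few values or the values never repeat.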

Joker_vD|13 days ago

...a long-lived HTTPS connection that manages to transfer >700 GiB of traffic, with no disconnects, and presumably has re-keying disabled? An interesting theoretical setup, I guess.

bflesch|13 days ago

Slightly unrelated, but aren't these AES-specific custom CPU instructions just a way to easily collect the encryption keys? There is a speedup but is it worth the risks?

If I were a nation state actor, I'd just store the encryption keys supplied to the AES CPU instruction somewhere and in case the data needs to be accessed you just read the stored keys.

No need to waste time deploying a backdoored CPU firmware and wait for days or weeks, and then touch the hardware a second time to extract the information.

When all AES encryption keys are already stored somewhere on the CPU, you can easily do a drive-by readout at any point in time.

Linux kernel has a compile time flag to disable use of custom CPU instructions for encryption, but it can't be disabled at runtime. If "software encryption" is used, the nation state actor needs to physically access the device at least two times or use a network-based exploit which could be logged.

adrian_b|13 days ago

There are serious risks about backdoors in CPUs, but they are not about the CPU gathering the AES keys.

The storage required for this would be humongous, and the CPU cannot know which data the keys were used for. Moreover, this would be too easily defeated: even if the AES instructions allow specifying a derived round key, you can always decline to do this and use a separate XOR instruction to combine the round keys with the intermediate states. Detecting such use would be too difficult.

No, there is no basis for fearing that the AES keys can be stored in CPUs (on the other hand, you should fear that if you store keys in a TPM, they might never be erased, even if you demand this). The greatest possible danger of adversarial behavior of a CPU exists in the laptop CPUs with integrated WiFi interfaces made by Intel. Unless you disconnect the WiFi antennas, it is impossible to be certain that the remote management feature of the WiFi interface is really disabled, which is what would prevent an attacker from taking control of the laptop in a manner that cannot be detected by the operating system. The next danger by importance is in the computers that have Ethernet interfaces with the ability to do remote management, where again it is impossible to be certain that this feature is disabled. (A workaround for the case when you connect such a computer to an untrusted network, e.g. directly to the Internet, is to use a USB Ethernet interface.)

Aachen|13 days ago

I am not a chip designer but from my limited understanding, this "somewhere" is the problem. You can have secret memory somewhere that isn't noticed by analysts, but can it remain secret if it is as big as half the cpu? A quarter? How much storage can you fit in that die space? How many AES keys do you handle per day? Per hour of browsing HN with AES TLS ciphers? (Literally all supported ciphers by HN involve AES)

We use memory-hard algorithms for password storage because memory is more expensive than compute. More specifically, it's die area that is costly, but at least the authors of Argon2 seem to equate the two. (If that's not correct, I based a Stack Overflow post or two on that paper, so please let me know.) It sounds to me like it would be easily visible under a microscope if there were another storage area as large as the L1 cache (which can hold a few thousand keys at most... how to decide which ones to keep?).

Of course, the cpu is theoretically omnipotent within your hardware. It can read the RAM and see "ah, you're running pgp.exe, let me store this key", but then you could say the same for any key that your cpu handles (also rsa or anything not using special cpu instructions)

wat10000|13 days ago

I don't imagine it would be too difficult to snoop the instruction stream to identify a software implementation of AES and yoink the keys from it, at least if the implementation isn't obfuscated. If your threat model includes an adversarial CPU then you probably need to at least obfuscate your implementation, if not entirely offload the crypto to somewhere you trust.
