Web's random numbers are too weak, researchers warn

[+] tptacek|10 years ago|reply

This presentation makes very little sense to me.

It appears to revolve around the idea that Linux servers "produce" entropy at an unexpectedly low rate over time, and "consume" entropy quickly.

But that's not how CSPRNGs work. A CSPRNG is a stream cipher, keyed by whatever entropy is available to seed it at boot, and, for purposes of forward secrecy, periodically (or continually) rekeyed by more entropy.

Just as for all intents and purposes AES-CTR never "runs out" of AES key, a CSPRNG doesn't "run out of" or "deplete" entropy. The entire job of a CSPRNG, like that of a stream cipher, is to take a very small key and stretch it out into a gigantic keystream. That's a very solved problem in cryptography.

I am automatically wary of anyone who discusses "entropy depletion", even moreso when they appear to be selling "entropy generators".

[+] JoshTriplett|10 years ago|reply

So you're saying that Linux's /dev/random should go away, /dev/urandom or getrandom is always what you want as long as the system has collected enough entropy (even for key material), and Linux could stop "collecting entropy" once it has enough to seed the CSPRNG?

[+] aidenn0|10 years ago|reply

Perhaps you can clear something up: my understanding is that linux /dev/urandom will drain the entropy pool first and then use a CSPRNG when it is empty.

This is what I was told back in the 2.6 days, and it always seemed to me that you would want to stretch your entropy out more.

[+] __Joker|10 years ago|reply

The blackhat whitepaper[1] and presentation[2] has more info, which might appetite curious HNers.

[1] https://www.blackhat.com/docs/us-15/materials/us-15-Potter-U...

[2] https://www.blackhat.com/docs/us-15/materials/us-15-Potter-U...

[+] qrmn|10 years ago|reply

Honestly, this sales brochure of a "paper" tastes even worse than the BBC fluff piece. This is below the standard of paper I would have expected Black Hat to accept.

Good CSPRNG design is not a "dark art", and entropy is not "consumed" when a good CSPRNG is used. Any good CSPRNG uses a good PRF - any good block cipher in CTR mode, a hash, or a HMAC, perhaps - to stretch one good, solid, 256-bit entropy seed into as much cryptographically-secure random data as you'll ever need over the lifetime of your cryptosystem, and ratchets forward through the PRF after each call so the state cannot later be reversed (without breaking the PRF, but you're using a good one, so you'll be fine).

You need quality entropy to seed a CSPRNG - not quantity. Yes, it is, as we all know, very important you don't try to use a CSPRNG before its initial seed has collected enough good entropy - which is, yes, a particular problem in embedded systems or headless servers - but after that, the entropy in your CSPRNG seed isn't something that magically disappears, as you'll see from the design of Linux's newest random-number API, getrandom, patterned after the OpenBSD folks' ideas.

Reseeding a CSPRNG's state with more entropy is not a benefit, but in fact a potential risk every time you do it: it can result in entropy injection attacks if an attacker can observe the state, and control some of the entropy. That, in turn, could break your whole cryptosystem, especially if you're using fragile primitives like DSA or ECDSA. One source: http://blog.cr.yp.to/20140205-entropy.html

Detuned ring oscillator TRNGs [p2] can be troublesome to protect from RF side-channel attacks, or even RF injection attacks in pathological cases. Carefully used, they are fine, but best used when combined with shot-noise-based TRNGs. You can find those in surprising places: even the Raspberry Pi's BCM2835/BCM2836 has a perfectly serviceable one, available from /dev/hwrng after bcm2708-rng/bcm2835-rng has been loaded, and which rngd can use with no trouble.

Forgive me if, therefore, I perhaps wouldn't like to buy a "quantum random number generator" from Allied Minds Federal Innovations, Inc, who are behind this "paper", or to replace the OpenSSL RNG with theirs. That all feels far too much like a black box, and Dual_EC_DRBG is still fresh in our memory. I'd rather use the one Alyssa Rowan described to me on a napkin, thanks, or LibreSSL's/OpenBSD's arc4random with ChaCha20, or CTR_DRBG, or HMAC_DRBG.

[+] pedrocr|10 years ago|reply

Going through these it seems they've gone to a whole lot of trouble to implement something no one really needs (an extra layer for entropy management). Their reasoning seems to be:

1) OpenSSL seeds its CSPRNG once on startup from /dev/urandom

2) A Linux server will often have low entropy when responding to that call

3) The OpenSSL CSPRNG is thus compromised and some extra logic is needed to only take the value once there's enough entropy

The thing is 2) is not really a problem. As long as the pool had enough entropy at the beginning to seed the /dev/urandom CSPRNG the output is good forever even if the pool is now empty. I think most distros already make sure /dev/urandom is properly seeded on startup so there should be no real attack here.

On 3) we should probably be going the other way (less code). Apparently OpenSSL actually has its own CSPRNG instead of just reading from /dev/urandom when it needs random numbers. Maybe there's a valid performance reason to do that (less context switches) but I doubt it.

[+] rsy96|10 years ago|reply

The definition of cryptographic pseudorandom generators are deterministic functions that turn a true random number (seed) of bit length k into a stream of seemingly random numbers of bit length n, where n is much larger than k. A computationally bounded attacker cannot distinguish this stream of pseudo random numbers from true random numbers, unless he/she knows the seed.

So if the CSPRNG are truly cryptographic secure, you don't need constant stream of high entropy input. You only need enough starting entropy, say 256 bits, and you will be fine for a long time.

[+] pilsetnieks|10 years ago|reply

Is this really the kind of article we want here?

> A study found shortcomings in the generation of the random numbers used to scramble or encrypt data.

> The hard-to-guess numbers are vital to many security measures that prevent data theft.

> But the sources of data that some computers call on to generate these numbers often run dry.

> This, they warned, could mean random numbers are more susceptible to well-known attacks that leave personal data vulnerable.

I get that you have to simplify for the ordinary people but this looks like talking to a five-year-old.

[+] colinbartlett|10 years ago|reply

Funny, I came to the comments to remark about how nice it was to see an article very clearly articulate the issue.

Yes, I generally understand these concepts, but I am not a security professional and found the explanation useful both for my own comprehension and for improving my ability to relay complex technical topics to non technical people.

[+] amouat|10 years ago|reply

I submitted it, and that thought did cross my mind.

However, I see someone has linked the papers with the original research in a comment which allows anyone curious to dig into the actual details.

At any rate, it's always interesting to see how the mainstream media cover computing/hacking news.

[+] davnicwil|10 years ago|reply

I guess you're not familiar with the BBC's tech journalism?

TLDR: It's interesting to HN readers only on the meta level (how does a non-technical readership consume tech news in 2015).

BBC tech news makes very strange reading for someone in the tech world. All the technical terms we use to be precise about things are replaced with broad, ostensibly more understandable, terms that do often sound very childish in comparison - sometimes so much so that they do actually obscure the meaning and even the truthfulness of the message.

As such it can make for frustrating reading, only because you wish there was a better way to communicate these ideas that doesn't mislead people or dumb down the information so much that it's basically without value.

There almost certainly is, to be honest, but it's very difficult, and so probably an unwise investment of effort for the likes of the BBC, whose readers are mainly there to read 'general' news stories and might skim a couple of tech stories as an aside.

It's probably much like a Physics PhD going back to science classes in school - "I see what you're trying to say here, and I perhaps can see why you're saying it in the way you are, to introduce a concept in a way you think is most accessible, but what you're saying isn't actually fully correct, and I find it irritating that you're implying it is".

[+] unknown|10 years ago|reply

[deleted]

[+] davidgerard|10 years ago|reply

It's relevant and useful to know that this is now a matter of mainstream concern.

[+] sarciszewski|10 years ago|reply

Uh oh, is this another Debian bug or the underpinnings of a fundamental weakness in Linux's CSPRNG?

I wonder if OpenBSD's arc4random_buf() is unaffected?

cc 'tptacek :)

[+] kaesve|10 years ago|reply

As far as I could see, it's not a problem in the CSPRNG itself, but in how it is used. More specifically it seems like a lot of applications use more entropy bits per second than servers generate by normal use. I'd say this is the result of not understanding how CSPRNG works and how to use it safely. Adding more or better sources of entropy to your systems would solve this.

[+] tedunangst|10 years ago|reply

OpenBSD does not allow entropy to be "used up".

[+] atoponce|10 years ago|reply

This paper asserts something innacurate: entropy pools can be "used". Entropy is not an object, it's an estimation. Just as you don't use temperature, or barometric pressure, you don't "use" entropy.

Further, once a CSPRNG is properly seeded, there is no need to concern yourself with whether or not it can produce "high quality random numbers", provided the cryptographic primitive behind the CSPRNG contains conservative security margins. The Linux kernel CSPRNG uses SHA-1 as the underlying primitive. While SHA-1 is showing weaknesses in blind collision attacks, it still contains wide security margins for preimage attacks, which is what would need to be applied to a key generated by a CSPRNG (you have the output, now predict the state that produced it). Even MD5 remains secure against preimage and second preimage attacks.

Again, once properly seeded, the Linux CSPRNG can produce data indistinguishable from true random indefinitely until SHA-1 is sufficiently broken.

[+] praseodym|10 years ago|reply

I have a VM running Debian Jessie (Linux 3.16) that has very low entropy available (cat /proc/sys/kernel/random/entropy_avail returns <200 most of the time) even though the Intel RDRAND instruction is available. Shouldn't it be using that to fill up the entropy pool, or am I misunderstanding how the entropy_avail value works?

[+] Freaky|10 years ago|reply

Linux is very conservative with how much entropy it credits to things like RDRAND since they can't be easily trusted. Looks like you get one extra bit per interrupt:

https://github.com/torvalds/linux/blob/master/drivers/char/r...

You'll note no other use of arch_get_random_* throws anything at credit_entropy_bits().

Linux 3.17 did bump up the assumed quality of virtio-rnd:

https://github.com/torvalds/linux/commit/34679ec7a0c45da8161...

[+] mukyu|10 years ago|reply

The talk about needing to constantly add more entropy or 'manage' it is nonsense. djb says it best: http://blog.cr.yp.to/20140205-entropy.html

Briefly, once you have say 256 random bits it is trivial to use AES and CTR mode and turn that into 2^71 random bits until you need to rekey. If you cannot get more entropy in the time it takes to use up all of those numbers something is completely broken. The only problem you can have is not having enough entropy to bootstrap (such as VMs or needing to generate a key at poweron on an embedded device), but this paper gives little more than lipservice to it.

[+] im3w1l|10 years ago|reply

How big problem is this in practice? Let's say you only have 256 bits of "real entropy" and you then stretch that into large amounts of pseudo-random bits using a state of the art CSPRNG and use those bits for all your randomness needs. Let's say worst case scenario here, so a server that is online for several years, with no reseeding at all. Are there any practical attacks against that?

[+] marcosdumay|10 years ago|reply

If you stretch it into a set of numbers with 256 bits or less, you are good. If you expect to generate bigger random numbers from it, you have a problem.

But the pool does not stay with only 256 bits for long (if at all). It's always accumulating more.

Anyway, if the pool ever get to zero, it means that an attacker with infinite resources that can see the entire sequence generated by the CSPRNG could predict the next numbers it'll generate. On practice none of those conditions are met.

[+] mangeletti|10 years ago|reply

It would be interesting to get Bruce Scheier's take on this.

He wrote this in 99, and it talks about key length near the bottom (though doesn't cover this exact scenario): https://www.schneier.com/crypto-gram/archives/1999/0215.html

47 comments