It's a good introductory write-up, but three things about it frustrated me to no end.
> "As a human, I can do this very easily. 100101011010010110001101 There, I just did it.
No, you didn't. That number is most certainly not random, there are biases in it, you just don't know there are. Your mind is not random. This is why we use dice and not people to generate random bits.
> What do you mean by “kind of random number”? Aren’t all random numbers the same. Not really. There are two primary types of random number generators.
I growled audibly at this one. Random numbers should all be the same in quality (not quantity, of course); there should be no discernible difference. There is only one kind of random number, just different types of generators. With a PRNG, if the seed is provably destroyed, there should be absolutely no way to distinguish a number generated by a PRNG from one generated by a TRNG.
> The computer hardware isn’t the only source of entropy. The user’s own mouse and keyboard movements can be used as well.
No. These movements are not random. As with my first gripe, your brain is not random. We used to use this approach to generate entropy, and now we don't, because this is understood: "random" user inputs should absolutely never be used to generate randomness in anything security-related.
> Despite the benefits of CSPRNGs, as with everything else in the tech industry, security can never be guaranteed.
I'm glad you pointed this out. Everything in cryptography is based on unproven assumptions, and this is an important point that people should understand: it could be that P=NP, it could be that an algorithm exists to factor numbers to primes. We think not, but really we don't know.
> That number is most certainly not random, there are biases in it, you just don't know there are.
Indeed. In that specific example, the author generated 24 binary digits, but only on 7 occasions was the digit the same as the previous digit. There were 23 opportunities to do so, and in a truly random sequence of digits the chance of any digit being the same as the previous one is 50%.
Applying the binomial distribution, the probability of observing 7 or fewer repeats in 23 trials with a 50% chance on each trial is about 4.7%. So with p ≈ 0.047, just under the conventional 0.05 threshold, we can reject the null hypothesis that the author's generated sequence is random, and conclude that it is biased.
There are online demos where you can try this yourself: https://people.ischool.berkeley.edu/~nick/aaronson-oracle/in...
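That tail probability is easy to check with a few lines of Python (standard library only; it comes out just under 5%):

```python
from math import comb

def binom_lower_tail(k, n):
    # P(X <= k) for X ~ Binomial(n, 1/2): count the favourable
    # outcomes among the 2**n equally likely bit strings.
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

p = binom_lower_tail(7, 23)
print(round(p, 4))  # 0.0466
```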
Yes, I'm aware that there's an ongoing debate about whether a human (or anything, for that matter) can truly be random. But I wasn't going to bring that up in this article, or add caveats to every single point, or else it would've been unbearable to read.
I used artistic license in certain instances. But I agree some things could've been worded better and I'll edit those.
> it could be that P=NP, it could be that an algorithm exists to factor numbers to primes
Nits, but: 1) even if P=NP, it is possible that the problems we've chosen have a lower bound on their polynomial-time algorithms that still leaves them secure for the indefinite future (O(n^Graham's number) is polynomial). Of course, we also don't have any proof of that, so your general point stands. Relatedly, 2) an algorithm certainly exists to find prime factors (quite a few, actually!), but none that we know of runs in polynomial time without quantum computing.
>No, you didn't. That number is most certainly not random, there are biases in it, you just don't know there are.
Isn't that true of CSPRNGs as well, though? One of them may have some huge flaw that allows it to be predicted, we just don't know what it is yet.
(I think you basically agree with this in your last paragraph?)
More generally, when we ask for a random number, what we're usually looking for is a number uncorrelated with some other number. For example, when you're running an experiment, the purpose of the randomization is to ensure that you're not really capturing some other effect that determined which parameters were grouped together.
That is, if you're deciding to run the control vs. test case because of subtle fluctuations in the room's lighting (which you're unaware are influencing you), then you may have no real effect at all, but accidentally pick up an effect that was really due to the lighting, not the thing you're trying to test. Using a dice roll prevents this, not because of randomization per se, but because it reliably breaks other correlations that might seep into the experiment.
To the extent that a human randomizer meets that in one particular domain, it may be good enough (not that you were saying otherwise). It's just ... you'd rather not bet on that being true in any given case.
I don't know what is actually done, but I would think you could do things like take the milliseconds between keystrokes, or the distance the mouse moved modulo 2, to generate randomness. That said, I agree with your point that humans are not random.
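A toy sketch of that idea (the intervals below are made up for illustration; as the comments above stress, nothing like this should be the sole entropy source for anything security-related):

```python
import hashlib

# Hypothetical inter-keystroke intervals in milliseconds.
intervals_ms = [183, 142, 97, 210, 178, 155, 131, 164, 149, 202]

# Keep only the least significant bit of each interval, where the
# hard-to-predict jitter lives.
raw_bits = [t & 1 for t in intervals_ms]
print(raw_bits)  # [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# The raw bits are still biased and correlated, so condition them
# through a cryptographic hash before using them as seed material.
seed = hashlib.sha256(bytes(raw_bits)).digest()
print(len(seed))  # 32 bytes
```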
> No, you didn't. That number is most certainly not random, there are biases in it, you just don't know there are. Your mind is not random. This is why we use dice and not people to generate random bits.
Can't you say that any single (shortish) number is random? How can you demonstrate that 99999999999999999999 isn't random?
Writing your own random number generator can be a lot of fun. Lots of sources of entropy out there. Inspired by lavarand[0], I wrote an RNG in Python based on the output of the Global Consciousness Dot[1] (which is a ridiculous project in its own right). There are also a lot of ways to visualize and ascertain how "random" your numbers are, whether plotting Pearson's with matplotlib or using a command-line tool like ent[2] to calculate the degree of entropy.
[0] https://en.wikipedia.org/wiki/Lavarand
[1] https://gcpdot.com/
[2] https://manpages.ubuntu.com/manpages/bionic/man1/ent.1.html
> The identification of events and the times at which they occur are specified case by case, as are the statistical recipes. The approach explicitly preserves some latitude of choice, as is appropriate for an experiment exploring new territory. Accepting loose criteria for event identification allows exploration of a variety of categories, while the specification of a rigorous, simple hypothesis test for each event in the formal series assures valid statistics.
I’ve never seen someone so blatantly spell out that they are cherry-picking, but then also argue that the cherry-picking is good science ;) This is the sort of thing that gives real scientists a bad name.
That's a good callout, and I realize that was an error on my part. I meant to put in the Mersenne Twister, but when I went back through my notes while writing, I misread that linear congruential generators were popular decades ago.
I've been working with the NIST 800-22 evaluation suite [1] and yes, it is very hard. There are an infinite number of tests to determine whether a number is random. Ultimately it comes down to the probability that it is a good RNG/PRNG, and to the application. Ironically, if a PRNG is good enough, it can often be BETTER than a raw random source (which is why RNGs have a conditioning phase).
[1] https://csrc.nist.gov/Projects/Random-Bit-Generation/Documen...
An infinite number of tests, yes; it is like trying to find the best possible predictor of a source. No statistical test can prove that a source has full entropy and is uniformly distributed; it can only disprove it. These tests are too often misused and are usually not very useful: they tell us nothing about how much entropy might be in the sample, entropy sources are always biased anyway and fail the tests, and if you condition the output it will always pass the tests even if it is not random at all. A better method is to develop a stochastic model of the entropy source and attempt to estimate bounds on its entropy, which is still generally not feasible. The statistical tests can then be used as a sanity check on your entropy estimates.
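A first-order version of such an estimate is the most-common-value bound, in the spirit of (but much cruder than) the estimators in NIST SP 800-90B:

```python
from collections import Counter
from math import log2

def min_entropy_per_symbol(samples):
    # H_min = -log2(p_max): if the most common symbol appears with
    # frequency p_max, no more than -log2(p_max) bits of entropy
    # per symbol can be claimed. Optimistic for correlated data.
    p_max = max(Counter(samples).values()) / len(samples)
    return -log2(p_max)

# A biased bit stream: 11 ones out of 16 samples.
sample = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
print(round(min_entropy_per_symbol(sample), 3))  # 0.541
```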
The RNG needs a conditioning phase, or entropy extractor, to transform the output into a distribution that is indistinguishable from uniform, as is needed in cryptographic applications. You should always include this stage, because virtually no physical entropy source has a uniform distribution.
The best thing to do is to gather as much entropy as you can from your sources, maybe 10x what you think you need, put it through an entropy extractor (such as a cryptographic hash function) to generate a PRNG seed, and then use the PRNG.
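A sketch of that pipeline (the "pool" here is a placeholder byte pattern standing in for oversampled raw noise, and random.Random merely stands in for a proper DRBG; it is NOT a CSPRNG):

```python
import hashlib
import random

# Placeholder entropy pool: pretend this is ~10x oversampled raw
# noise from your sources rather than a predictable byte pattern.
pool = bytes(range(256)) * 10

# Extractor: compress the whole pool into a 256-bit seed.
seed = hashlib.sha256(pool).digest()

# Use the seed to drive a PRNG.
prng = random.Random(int.from_bytes(seed, "big"))
print([prng.randrange(100) for _ in range(5)])
```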
http://www.nrbook.com/devroye/
An undergraduate algorithms + statistics class is sufficient to get a lot out of this book, even if it's just exposure to the wide variety of techniques for generating random numbers on a computer.
Computed:
NOT truly random. Way back in the late '70s, I had a BASIC program that would output the very same set of 8-digit 'random' numbers every time the program was run.
(Unless you did a special thing the first time, to obtain a random seed to generate a different sequence of pseudo-random numbers. IIRC, it was the number of machine cycles since the last time the floppy disk was accessed.)
Hardware:
Truly random: The machine counts machine cycles, with a small maximum before the counter overflows and restarts at zero. It looks at the time difference between things like key presses, disk spins, or something else that varies in time.
No matter how good you are, the variation in time between your key presses is never the same at the very small timescale. That number is used as the seed to a pseudo-random generator.
A modern "pseudo random number" is simply a sequence of numbers that visits the state-space in an order that's difficult to detect with modern statistical tests. (Chi-squared, among others). See PractRand or TestU01 as two packages for testing these sequences. That is to say: instead of the sequence "0, 1, 2, 3, 4, 5... 4294967296... 0", your RNG will do some other sequence.
Yes, "0, 3, 6, 9... 4294967295, 2, 5, 8... 4294967294, 1, 4, 7... 4294967293, 0, 3..." is a RNG of sorts, but an example of a really, really bad one that would instantly fail most statistical tests. :-) But conceptually, the Mersenne Twister, LCGRNG, and LSFR all accomplish this. Its a "reordering" of the sequence. That's it.
A "cryptographic random number" just adds a few additional tests to the pool of statistical tests. In particular: differential cryptography gets into the nitty gritty about which bits can predict the results of other bits. You have to assume that the "opponent" is willing to use extraordinary amounts of computing power to detect patterns.
If bit #25 has a 51% correlation with bit #30, you fail as a cryptographic random number generator. You need at least 2^128 worth of security, which means the correlation between current and future bits must be vanishingly close to 50% (50.00000000001% is maybe fine).
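The sort of check being described can be sketched in a few lines (a toy version; real batteries like PractRand or TestU01 are far more elaborate):

```python
def correlation(bits_a, bits_b):
    # Fraction of positions where the two streams agree; an ideal
    # generator keeps this indistinguishable from 0.5.
    return sum(a == b for a, b in zip(bits_a, bits_b)) / len(bits_a)

# Toy data standing in for "bit #25" and "bit #30" sampled across
# many outputs of a generator.
a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 0, 0, 1, 0, 1, 1, 0]
print(correlation(a, b))  # 0.75, far from 0.5: a red flag
```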
For example: the sequence: {AES(0, key), AES(1, key), AES(2, key)... AES(2^128, key), AES(0, key)...} is a cryptographically secure random number generator. The sequence will loop after its 128 bits of state are exhausted. If the "opponent" doesn't know the key, the bitwise correlations are cryptographically sound (thanks to the hard work of the engineers behind AES).
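That counter construction can be sketched with SHA-256 standing in for AES (an illustrative substitution so it runs with only the standard library; the structure, not the cipher, is the point):

```python
import hashlib

class CounterModeRNG:
    # "Encrypt" an incrementing counter under a fixed key:
    # output block i is SHA-256(key || i), mimicking AES(i, key).
    def __init__(self, key: bytes):
        self.key = key
        self.counter = 0

    def next_block(self) -> bytes:
        block = hashlib.sha256(
            self.key + self.counter.to_bytes(16, "big")
        ).digest()
        self.counter += 1
        return block

rng = CounterModeRNG(b"a fixed secret key")
print(rng.next_block().hex()[:16])
```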
A true random number generator is just a cryptographic number generator applied to a truly random seed. White-noise generators are a well-known electrical-engineering trick: resistor noise is everywhere but rather small (you can build a white-noise generator from Johnson-Nyquist noise if you really want to). More likely, you'd use shot noise from a transistor junction, at least at the hobbyist level.
Intel / AMD have true random number generators from some kind of electrical noise generator on every CPU, which feeds into cryptographic random number generators.
There are other sources of noise: radiation is a well-known one, but I'm not sure it's practical.
There's a "speed limit" to white noise generators. You only can extract so much entropy from them in a given time (ex: Remember: CPUs operate at 4GHz, or 0.25 nanoseconds per clock tick). Cryptographic random number generators "stretch" the true seed of randomness, while the white-noise generator continues to grab more entropy.
Technically a PRNG state sequence could also have distinct loops, returning a reordering of only a subset of the sequence for any given seed. For example, `return state += 2;` would return two separate sequences for odd or even seeds. We know that a PRNG with N bits of state must repeat after at most 2^N invocations, but it requires further proof that a given PRNG will not repeat at smaller intervals.
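The `state += 2` example is easy to see with a 4-bit state: each seed parity gets its own cycle of length 2^(N-1), not 2^N:

```python
def step(state, nbits=4):
    # The toy PRNG from above: add 2 modulo 2**nbits.
    return (state + 2) % (1 << nbits)

def cycle(seed, nbits=4):
    # Follow the state until it returns to the seed.
    states, s = [], seed
    while True:
        states.append(s)
        s = step(s, nbits)
        if s == seed:
            return states

print(cycle(0))  # [0, 2, 4, 6, 8, 10, 12, 14]
print(cycle(1))  # [1, 3, 5, 7, 9, 11, 13, 15]
```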
> For example: the sequence: {AES(0, key), AES(1, key), AES(2, key)... AES(2^128, key), AES(0, key)...} is a cryptographically secure random number generator. The sequence will loop after its 128 bits of state are exhausted.
In light of the above, is it possible to prove this (through some property of AES)?
(Also, we can't really prove that AES, or any other current algorithm, is cryptographically secure. Tomorrow, a new cryptanalysis technique could be revealed which breaks it.)
PRNGs can actually be better than true RNGs, which is why RNGs generally just provide a seed for a DRBG. RNG conditioning has to do things like remove extremely long runs of 1s or 0s, and RNGs often cannot generate material fast enough for today's demands (e.g., under 1 Kbps).
I'm trying to remember a technique based on listening to static or white noise and taking the least significant bit. You then take that stream of bits and postprocess it by looking for bit changes rather than using the bits directly.
That's a vague and I'm sure inaccurate description, but it isn't enough for me to google it ... anyone know what I'm referring to?
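That description sounds like it could be the von Neumann extractor (just a guess on my part): pair up the sampled bits and keep a bit only when the pair shows a transition:

```python
def von_neumann(bits):
    # Walk non-overlapping pairs: (0, 1) emits 0, (1, 0) emits 1,
    # and (0, 0) / (1, 1) pairs are discarded. For independent but
    # biased bits, the surviving bits are unbiased.
    return [a for a, b in zip(bits[::2], bits[1::2]) if a != b]

# A biased sample, roughly 70% ones.
sample = [1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
print(von_neumann(sample))  # [0, 1, 1, 0]
```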
There are much cheaper, and easier, sources of white noise.
Anyone actually interested in the electronics of this should build a white-noise generator out of an Op-Amp + your favorite PN junction in reverse-bias mode (diode, BJT transistor, or whatever).
Shot noise from reverse-biased PN junctions is white noise at the quantum level. You're physically seeing random electrons cross a junction they weren't supposed to, and then amplifying that signal up to levels we can detect (well... not with our fingers, but a decent op-amp amplifier plus an Arduino can).
https://www.maximintegrated.com/en/design/technical-document...
EDIT: This circuit from Maxim uses reverse-breakdown noise from a Zener diode, which is more vigorous than shot noise and therefore easier to amplify. It's still white noise, and therefore "truly random", up into the MHz range. The circuit uses a Maxim voltage amplifier (I mean, the article is a big advertisement for how simple the MAX2650 is to use...).
I hope you were allowed to use some sort of reference material. Unless your previous work involved intimate familiarity with PRNGs, I would never expect anybody to be able to implement one on the spot, off the top of their heads. Even with reference material, I don't think I'd expect anything more involved than a linear congruential generator.
Without reference material, probably the best I could do would be to read from one of the /dev/*random devices. I'm guessing that, whatever the actual intention behind that question, writing a function that essentially reads from a file doesn't match the interviewer's expectations. :/
https://en.wikipedia.org/wiki/Linear_congruential_generator
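For reference, the linear congruential generator mentioned above fits in a few lines (using the well-known Numerical Recipes constants):

```python
class LCG:
    # x_{n+1} = (a * x_n + c) mod m, with the Numerical Recipes
    # constants a = 1664525, c = 1013904223, m = 2**32.
    def __init__(self, seed: int):
        self.state = seed % (1 << 32)

    def next(self) -> int:
        self.state = (1664525 * self.state + 1013904223) % (1 << 32)
        return self.state

rng = LCG(42)
print(rng.next())  # 1083814273
print(rng.next() % 100)
```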
It's preferred to use RDSEED and use that as a seed, instead of implicitly trusting the output of RDRAND. Mix it with user entropy (or system entropy).
Some say RDRAND is a backdoor (and after Dual EC, it would not particularly surprise me). This is why RDRAND doesn't exist in Linux (or maybe just isn't the only source of entropy?): https://www.linux-magazine.com/Online/News/Linus-Says-No-Bac...
Not sure of the security implications of using poisoned seeds in addition to truly random seeds in an RNG. Some say it's okay, because it cannot reduce security, only fail to increase it. But if the CPU's RDRAND instruction is backdoored, couldn't the RNG instructions be intercepted and replaced so that RDRAND is the only seed? Then again, if your CPU is backdoored, why bother with anything? Etc., etc. This discussion could go on for a while.
Jerrrry|4 years ago
It is crucial to inform the user of this, as software has.
"Generating random numbers, please move your mouse around randomly" is suitable for literally all use cases.
Someone|4 years ago
I doubt that is still true for ‘modern’ languages. Let’s google a few.
https://docs.microsoft.com/en-us/dotnet/api/system.random?vi... says “The current implementation of the Random class is based on a modified version of Donald E. Knuth's subtractive random number generator algorithm”. So, that’s a no.
https://rust-random.github.io/book/guide-rngs.html doesn’t appear to mention multiplicative algorithms, either.
https://developer.apple.com/documentation/swift/systemrandom... says “SystemRandomNumberGenerator is automatically seeded, is safe to use in multiple threads, and uses a cryptographically secure algorithm whenever possible.”
sunny--tech|4 years ago
I've edited that section.
Borrible|4 years ago
https://www.researchgate.net/publication/283762433_The_Cosmi...
hdivider|4 years ago
https://www.idquantique.com/random-number-generation/product...
abnry|4 years ago
And also Dilbert: https://dilbert.com/search_results?terms=Random+Number+Gener...