When Random Isn't Random Enough: Lessons from an Online Poker Exploit

[+] aelaguiz|12 years ago|reply

As recently as 2010 we were finding major flaws in online poker security. Here are a couple of videos I did of us sniffing hole cards out of the air because sites were lying about their use of SSL. They were using XOR encryption. Insane.

http://www.youtube.com/watch?v=4HBUe8Fb73Q http://www.youtube.com/watch?v=AAQDEXJdbQc

[+] SeanDav|12 years ago|reply

Ouch, I suppose the moral of the story is don't play poker for money using a wireless connection.

[+] tboneatx|12 years ago|reply

Yeah, I remember that. :)

[+] chops|12 years ago|reply

The solution here, which the article fails to mention, and which every security expert will undoubtedly tell you, is to make sure you use super random numbers (that's the technical term, for the layperson) by adding two random numbers together.

[+] InclinedPlane|12 years ago|reply

Hah, nice one. For people not getting the joke, adding two random numbers reduces the randomness and concentrates the results around a mean.
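To make the joke concrete, here is a small sketch (using two fair six-sided dice as an assumed stand-in for "two random numbers") that computes the exact distribution of the sum. Each die is uniform on its own, but the sum piles up around 7:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sum_counts(sides=6):
    """Count how many of the sides*sides equally likely rolls give each sum."""
    return Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))

counts = sum_counts()
total = sum(counts.values())  # 36 equally likely outcomes for two d6

# Exact probability of each sum; a uniform result would give every sum the same value.
dist = {s: Fraction(c, total) for s, c in sorted(counts.items())}

for s, c in sorted(counts.items()):
    print(f"{s:2d} {'#' * c}")  # crude histogram: the middle sums dominate
```

The extremes (2 and 12) each occur in only 1 of 36 rolls, while 7 occurs in 6 of 36.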

Interestingly, the perception that adding random numbers together results in even more random numbers is behind the popularity of the scam game "razzle". In razzle there's usually a board containing an array of depressions, each lined up with a different value, onto which a number of marbles are dropped (there are other ways to play as well, including dice). The important part is the scoring board. After each toss the values where all the marbles landed are added up, and then a board is consulted to see how many "points" are scored from that total. The game sounds easy: get to 10 points and you win.

However, there are two tricks. First, the scoring board is arranged in non-sequential order. This is to conceal the fact that the group of middle numbers do not win any points. In actuality it is very difficult to win any points, since the probabilities are all concentrated in the middle. Second, because of the scattered nature of the scoring board, it's very easy for the person running the game to cheat in your favor by "accidentally" giving you points when you shouldn't have earned them.

The scam then works fairly simply. People pay money for each throw, and they are given the opportunity to win a high-value prize. For the early throws the operator goes quickly and fudges the score lookups, building up points for the player that they haven't actually earned and giving them an unwarranted confidence in the game. After the player gets within a point or so of winning, the operator stops cheating and lets them play completely fairly on their own, at which point they have odds of worse than a thousand to one of winning (keeping in mind that it costs money for every throw).

[+] hueving|12 years ago|reply

No! You call a blocking rand function bound by available entropy.

If your random source is compromised, adding two numbers from the same broken source does nothing. What you can do though, is XOR numbers from independent random sources to improve the entropy of the final output. (not sure if that's what you meant by adding random numbers together)
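A minimal sketch of the XOR idea, with loudly stated assumptions: `xor_bytes` and `combined_random` are hypothetical helper names, and the two "sources" below are only placeholders. In CPython both `os.urandom` and `secrets.token_bytes` draw from the same OS generator, so they are not independent; a real setup would replace one of them with a separate device.

```python
import os
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def combined_random(n: int) -> bytes:
    """Return n bytes formed by XORing two sources (placeholders here)."""
    source_a = os.urandom(n)            # stand-in for source #1
    source_b = secrets.token_bytes(n)   # stand-in for an independent source #2
    return xor_bytes(source_a, source_b)

# If one source is completely broken (here: all zeros), the XOR output is
# exactly the other source, so the good source's randomness survives.
good = os.urandom(8)
broken = bytes(8)
survives = xor_bytes(good, broken) == good
print(survives)
```

The property only holds when the sources really are independent; XORing two outputs of the same compromised generator gains nothing, as the comment above says.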

[+] rlwolfcastle|12 years ago|reply

Strangely, the name of the submitter's company is "Additive Analytics"

[+] unknown|12 years ago|reply

[deleted]

[+] comex|12 years ago|reply

I understand that "swap with entire deck" can't possibly be uniform because it has 52^n input possibilities, which is not divisible by 52! (and that the correct Fisher-Yates having 52! input possibilities and being able to generate every possible outcome is one way to prove that it is uniform). However, I'm not sure I can come up with an intuition for why any particular bias should exist, or why there is a discontinuity that makes it much more likely for a card to end up a short distance after its starting position:

http://en.wikipedia.org/wiki/File:Orderbias.png

Anyone have a good explanation?
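The counting argument can be checked by brute force on a tiny deck. The sketch below (function names are my own) walks every possible sequence of random choices for both algorithms and tallies the resulting permutations. For 3 cards the naive shuffle has 27 equally likely paths spread over 6 permutations, so it cannot be even:

```python
from collections import Counter
from itertools import product

def apply_swaps(n, choices):
    """Swap position i with position choices[i], for i = 0..n-1, on the deck 1..n."""
    deck = list(range(1, n + 1))
    for i, j in enumerate(choices):
        deck[i], deck[j] = deck[j], deck[i]
    return tuple(deck)

def naive_counts(n):
    """'Swap with entire deck': every position picks a partner from all n slots."""
    return Counter(apply_swaps(n, c) for c in product(range(n), repeat=n))

def fisher_yates_counts(n):
    """Fisher-Yates: position i picks a partner only from slots i..n-1."""
    ranges = [range(i, n) for i in range(n)]
    return Counter(apply_swaps(n, c) for c in product(*ranges))

naive = naive_counts(3)
fair = fisher_yates_counts(3)
for perm in sorted(naive):
    print(perm, naive[perm], fair[perm])
```

For n = 3 the naive counts come out as 4 or 5 paths per permutation (out of 27), while Fisher-Yates reaches each permutation by exactly one path.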

[+] malisper|12 years ago|reply

I'm not entirely sure, but I have been able to figure a few things out.

By running some simulations by shuffling the range from 1 to n (with n going from 3 to 7), I found that at least one of the most common permutations had always started with 2. I'm unable to come up with a reason to explain this, but I was able to figure out something else interesting.

Imagining that the deck is vertical and going through each card and swapping it with a random card in the deck, the random card that has been swapped will never move further up the deck while the other card can still possibly move further up the deck. This implies that cards at the top of the deck should stay near the top and the cards at the bottom should stay near the bottom. I tested this by taking the sum of the sums of the first half of all the permutations and the sum of the sums of the second half of all the permutations and found that the total sum of the first halves was slightly smaller than the total sum of the second halves.

  n | first halves | second halves
  3 |           53 |            54
  4 |         1265 |          1295
  5 |        18322 |         18976
  6 |       461683 |        498093
  7 |      9638931 |      10051128

So it seems that cards starting near the top are more likely to end up near the top and cards starting near the bottom are more likely to end up near the bottom.

Note: for odd numbers I threw out the middle number.

[+] sxyuan|12 years ago|reply

Here's my attempt at an intuitive explanation:

After the first swap, the distribution of numbers at the first position is uniform. What happens in the second swap? Well, the second position contains 1 with probability 1/n, and 2 with probability (n-1)/n. With probability 1/n, this position that is highly biased towards 2 (for n > 2) will get swapped with the first position. So 2 is more likely to be found in the first position.

Of course, it gets messy to carry this out for the n swaps and I still haven't really answered your question yet. To do that let's make the following assumption: after the ith swap, the distribution of numbers at position i is uniform. (I think you can prove this inductively but I haven't tried working it out - I was studying for a midterm before I got nerd-sniped.)

Before the ith swap, position i is biased towards number i. By our assumption, position i-1 is uniform. By making position i uniform we are essentially "smoothing out" the bias at position i to the other positions. Hence, as in the example I started with, it is more likely after swap i that position i-1 will contain number i, at the expense of the other numbers. This then reduces the likelihood of finding number k in positions 1, 2, etc. as opposed to position k-1: 2 is more likely in 1, 3 is more likely in 2, and so on.
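One way to test this intuition without simulation noise is to track a single card exactly. The sketch below is my own formulation, not something from the thread: it treats the card's position as a small Markov chain under the naive "swap with any slot" shuffle and uses exact fractions, with 0-based positions.

```python
from fractions import Fraction

def position_distribution(n, start):
    """Exact final-position distribution of the card that starts at `start`
    (0-based) when each slot i = 0..n-1 is swapped with a uniformly random slot."""
    dist = [Fraction(0)] * n
    dist[start] = Fraction(1)
    for i in range(n):
        new = [Fraction(0)] * n
        for p, prob in enumerate(dist):
            if prob == 0:
                continue
            if p == i:
                # Our card is in the slot being swapped: it lands anywhere uniformly.
                for j in range(n):
                    new[j] += prob / n
            else:
                new[i] += prob / n            # slot i happened to pick our card
                new[p] += prob * (n - 1) / n  # otherwise our card stays put
        dist = new
    return dist

n = 7
for start in range(n):
    dist = position_distribution(n, start)
    likeliest = max(range(n), key=lambda p: dist[p])
    print(f"card starting at {start}: most likely final slot {likeliest}")
```

For n = 3 the card that starts in the second slot ends in slots 1, 2, 3 with probability 10/27, 8/27, 9/27, which agrees with the claim that 2 is more likely to be found in the first position.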

[+] rfugger|12 years ago|reply

There's a decent explanation here:

http://www.codinghorror.com/blog/2007/12/the-danger-of-naive...

There's also a more complete story about the online poker exploit here:

http://www.cigital.com/papers/download/developer_gambling.ph...

[+] just2n|12 years ago|reply

It seems to me that the only major issue here is using a seed which can be trivially brute forced. Even if you don't look around the expected server time in order to guess the seed more quickly, 32 bits is really not hard at all to brute force these days.

I don't believe the number of bits the PRNG can generate is an issue here since we only need to uniformly get a number between 1 and 52, though what may be questionable is the cycle length of the PRNG if it weren't using an easily brute forced seed.

I'm not entirely convinced the off-by-1 is substantial, nor the fact that the shuffle produces duplicate shuffles (I can't intuit a significant bias, so I may well be wrong here).

So to summarize: never seed a PRNG with a small and easily brute forced value.
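A rough sketch of why a small, guessable seed is fatal. Everything here is an illustrative assumption: Python's `random.Random` stands in for the site's generator, the seed value and search window are made up, and the attacker is assumed to see the first five cards of the shuffled deck.

```python
import random

def shuffle_with_seed(seed):
    """Deterministically shuffle a 52-card deck from an integer seed."""
    rng = random.Random(seed)
    deck = list(range(52))
    rng.shuffle(deck)
    return deck

def recover_seed(observed_prefix, guess, window):
    """Try every seed within `window` of `guess`; return the first one whose
    shuffle starts with the cards the attacker has seen, or None."""
    k = len(observed_prefix)
    for seed in range(guess - window, guess + window + 1):
        if shuffle_with_seed(seed)[:k] == observed_prefix:
            return seed
    return None

secret_seed = 1_000_123                    # hypothetical clock-derived seed
deck = shuffle_with_seed(secret_seed)

# The attacker knows roughly what the server clock said and sees five cards.
found = recover_seed(deck[:5], guess=1_000_000, window=500)
predicted = shuffle_with_seed(found)
print(predicted[5:10])                     # cards the attacker should not know yet
```

Once a candidate seed reproduces the visible cards, the rest of the deck follows from it. Searching a full 32-bit space is the same loop with a larger range.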

[+] MikeTV|12 years ago|reply

Direct link to the full article with a detailed explanation of the exploit: http://www.cigital.com/papers/download/developer_gambling.ph...

[+] rlwolfcastle|12 years ago|reply

Ignoring that some of the variables don't match up properly (the arrays: card and Card), it seems like the explanation of the first flaw may also be flawed.

Flaw #1: An Off-by-One Error

The algorithm above tries to iterate over each card in the deck, swapping each card with another randomly chosen card in the deck. However—every programmer has made this mistake before—there's an off-by-one error. The function random(n) returns a number between 0 and (n-1), not between 1 and n as the programmer intends. As a result, the algorithm will never swap the 52nd card with itself; the 52nd card can never end up in the 52nd place. So that is the first reason the "random" card shuffling isn't really random.

The comment refers to the Pascal code:

  random_number := random(51)+1;

If the programmer really thought that random was between 1 and n then the random_number variable would be a number between 2 and 52 (1+1 to 51+1). It seems like, instead, a better explanation is that they may have thought random(n) produced a random number between 0 and n, hence the need to increment by one. Another explanation is they just messed up the slicing using 51 instead of 52.

The point being that in the writer's explanation of the flaw they actually make the same mistake.
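The range argument is easy to check by enumeration. This sketch models Pascal's `random(n)` as returning an integer with 0 <= x < n (the helper name `pascal_random_values` is made up for illustration) and lists what `random(51)+1` can produce:

```python
def pascal_random_values(n):
    """All values Pascal's random(n) can return: integers 0 <= x < n."""
    return range(0, n)

# random_number := random(51)+1;
buggy = {r + 1 for r in pascal_random_values(51)}

# What a full-deck index would need: random(52)+1
fixed = {r + 1 for r in pascal_random_values(52)}

print(min(buggy), max(buggy))   # 1 51
print(min(fixed), max(fixed))   # 1 52
```

So the shipped expression covers 1 through 51 and never selects card 52, whichever belief about `random` led to it.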

Funnily enough googling "pascal random" points to a stackoverflow article where the best answer makes the same error.

https://stackoverflow.com/questions/4965863/how-to-get-a-ran...

[+] atul_wired|12 years ago|reply

I think the programmer tried to be defensive by ignoring an edge case. Either he was lazy about searching the documentation (<= 1999, you know) and tried to be over-smart to avoid an "Out of Range" exception in test/production, or he didn't consider looking at the documentation, like some of us do. He may have given it a shot by running it several times to see if it actually generates the number provided as an argument.

[+] DanBC|12 years ago|reply

This was an interesting article. (Font size is tiny using Chrome on iOS).

> If your business or technology depends on using random numbers, your best bet is to use a hardware random number generator.

Some hardware RNGs would be hopeless for this task. It'd be scary to have to buy one of these things and trust the output.

[+] notduncansmith|12 years ago|reply

News/YC is my favorite iOS HackerNews client, it's free and beautiful, and comes with Readability so I never run into this problem. So many sites are either not responsive or do it badly, so it's a lifesaver.

[+] PhantomGremlin|12 years ago|reply

I haven't seen this link posted yet http://www.idquantique.com/random-number-generators/products...

Note they claim: "QUANTIS has also been approved by national authorities and can be used for gaming applications."

If I were implementing this for a casino, I'd do what other posters have already suggested and use at least two independent hardware sources for my random numbers and XOR them together. IMO Intel's on-chip RNG would probably be a good source to use, but only in conjunction with others.

[+] alextingle|12 years ago|reply

> IMO Intel's on-chip RNG would probably be a good source to use

Only if NSA contractors don't play online poker in their spare time.

[+] gedrap|12 years ago|reply

I'm curious how random the current generators in online poker actually are. I mean, some rather subtle patterns or situations would generate larger pots, and therefore more rake. Or being on the new player's side in 50/50 situations would 'help' to get him addicted.

I am not talking about 100% of the time dealing someone pocket kings, and someone else pocket aces and king on the flop.

Something subtle and very rare would be enough to account for large amounts of money at the end of the year, given the volume of major poker sites. On the other hand, if someone leaked it, that might ruin the business for good.

[+] letstryagain|12 years ago|reply

The major sites don't do this. We know because many people out there collect literally millions of poker hands observed on these sites and mine the data for every kind of statistic you can think of. If anything significant was out of whack they would have picked it up. Look at the 'online poker' section of the twoplustwo forums for example.

The random number generators used by these sites are hardware systems that use micro fluctuations in ambient temperature (for example) as a source of entropy and they are very careful to use enough bits of entropy for every card shuffled.

[+] karmicthreat|12 years ago|reply

Even bad poker sites are using HW RNGs. I worked for a hybrid meat space/online casino and we were using IDQ cards. Except when we were not. The owner got squirrely and went for bottom dollar to implement a standalone version of a carefully engineered system I had developed. The developers ended up using /dev/random instead of at least making a 1 gig RNG lookup table. Also, the ticketing system was unencrypted, so you could print whatever you wanted on a barcode and feed it to the redemption machine to clean it out. Also, the games were not tested statistically. Literally a guy sat there and played games until it "felt" right.

I tried to help them out, but eventually I washed my hands of everything and walked away. Too many weird things going on. From talking to others in the industry it is not that different elsewhere.

TL;DR - If you do work for casinos be ready to walk away when things get weird.

[+] sanswork|12 years ago|reply

Since the odds of getting certain hands are known and there are a lot of professionals with very large databases of hands, any manipulation like that would stand out pretty quickly as a statistical anomaly.
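As a hedged illustration of what such a check might look like, take one known probability: a starting hand is a pocket pair when the second card matches the first card's rank, which happens for 3 of the remaining 51 cards. The sketch below compares an observed count against that expectation using a normal approximation. The hand counts are invented for the example, not real data, and `z_score` is a hypothetical helper.

```python
import math
from fractions import Fraction

# Second hole card must match the rank of the first: 3 of the 51 remaining cards.
P_POCKET_PAIR = Fraction(3, 51)

def z_score(observed, hands, p=P_POCKET_PAIR):
    """Standard deviations between an observed count and the binomial expectation."""
    p = float(p)
    expected = hands * p
    sd = math.sqrt(hands * p * (1 - p))
    return (observed - expected) / sd

hands = 1_000_000                    # hypothetical database size
print(z_score(58_824, hands))        # close to expectation
print(z_score(60_000, hands))        # about 2% too many pairs
```

With a million hands, a 2% excess of pocket pairs already sits several standard deviations out, which is why subtle rigging is hard to hide from people with large hand databases.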

[+] foobarqux|12 years ago|reply

Professionals monitor the play of other players and identify statistical abnormalities. You would have to be clever and conservative enough to evade those types of checks, which, while I am sure it is possible, is not trivial.

[+] stephan10h|12 years ago|reply

Flaw #3 seems flawed to me. There are 52! possible ways to shuffle a deck of cards, but a game is only played using a small subset. Suppose there are 4 players; then you need 2 × 4 + 5 = 13 cards. The remaining deck of 39 cards can be shuffled in 39! ways without affecting the game. These possibilities are still included in the 52! total possibilities. In the case of 4 players there are only 52!/39! possible games that can be played. This is still a larger number than the 4 billion mentioned in the article, but it doesn't dwarf the 4 billion as the 8*10^67 does.
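The numbers in this argument can be computed directly. A small sketch, assuming the "4 billion" refers to the 2^32 possible values of a 32-bit seed:

```python
import math

total_orders = math.factorial(52)   # every ordering of the full deck
dealt_orders = math.perm(52, 13)    # 52!/39!: ordered choices of the 13 cards used
seeds = 2 ** 32                     # assumed meaning of "4 billion"

print(f"52!        ~ {total_orders:.3e}")
print(f"52!/39!    ~ {dealt_orders:.3e}")
print(f"2^32       ~ {seeds:.3e}")
print(f"deals per seed ~ {dealt_orders / seeds:.3e}")
```

52!/39! works out to roughly 4 × 10^21, so the seed space still covers only about one in 10^12 of the distinguishable deals, even though the gap is far smaller than the one against 52!.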

[+] bloodmoney|12 years ago|reply

I admit I am a total noob here, but couldn't you make something with a TV turned to a station with just static? I have often wondered about this but lack the 'propriate schoolin'.

[+] lutusp|12 years ago|reply

In engineering terms, it's easier to use a reverse-biased diode as a noise source. An input circuit would transfer the diode's random waveform into a shift register as zeros and ones, until the desired word size has been assembled.

It's really quite simple, and it could produce a very high degree of randomness. It would differ from typical PRNGs in that the binary sequence could not be reproduced, no matter how much you knew about the circuit.

> I have often wondered about this but lack the 'propriate schoolin'.

That's an easily remedied problem. Remember what Mark Twain said: "I have never let my schooling interfere with my education".

[+] jlgaddis|12 years ago|reply

> That's where this program is for: adding entropy-data to the kernel-driver. It does that by fetching 2 images from a video4linux-device (with a random delay in between), calculating the difference between those two and then calculating the number of information-bits in that data. After that, the data with the number-of-entropy-bits is submitted to the kernel-random-driver.

http://www.vanheusden.com/ved/

There's also an "audio" version of, basically, the same thing [0] that I intend on using in the near future. It's as simple as tuning an FM radio to "nothing" and connecting its headphone/line out to your sound card's input line.

[0]: http://www.vanheusden.com/aed/

[+] unknown|12 years ago|reply

[deleted]

[+] redacted|12 years ago|reply

http://www.random.org/ uses atmospheric radio noise to generate random numbers, similar to what you are proposing.

[+] zaius|12 years ago|reply

A similar idea was done by Silicon Graphics using a lava lamp - http://en.wikipedia.org/wiki/Lavarand

[+] voltagex_|12 years ago|reply

Someone had a go at that exact idea https://github.com/pwarren/rtl-entropy

[+] akater|12 years ago|reply

I'm not a pro (not even an amateur, actually), but the very premise of “shuffling the deck” bewilders me. Shuffling the whole deck is so obviously bug-prone. Why not just pick random elements from decks instead? If I ever wrote a deck simulator I'd never shuffle anything, just picked 1 out of n < 52 when needed. Is this approach too naive and well-known to be somehow flawed as well?

[+] strangestchild|12 years ago|reply

You're simulating a full game of poker. Once a player has been given card 'i', you have to ensure that card 'i' isn't drawn again during the game. You could maintain a set containing all cards that have already been drawn, and re-select your random number if you draw a duplicate, but that's going to get awfully laggy once large numbers of cards have been drawn.
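For what it's worth, drawing cards on demand does not need the retry loop. A minimal sketch (the function name `deal` is my own): keep a list of the cards still in the deck, pick a random index, and remove that card by swapping it to the end. This is effectively a Fisher-Yates shuffle performed lazily, one card at a time.

```python
import secrets

def deal(k, deck_size=52):
    """Draw k distinct cards, one at a time, without ever re-drawing."""
    remaining = list(range(deck_size))
    hand = []
    for _ in range(k):
        # Uniform index into the cards that are still available.
        idx = secrets.randbelow(len(remaining))
        # Move the chosen card to the end and pop it: O(1) removal, no retries.
        remaining[idx], remaining[-1] = remaining[-1], remaining[idx]
        hand.append(remaining.pop())
    return hand

cards = deal(13)   # e.g. 4 players' hole cards plus 5 community cards
print(cards)
```

Each draw costs the same regardless of how many cards are gone, so nothing slows down late in the deck. The quality of the result still depends entirely on the random source behind `secrets.randbelow`.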

[+] o_nate|12 years ago|reply

Another lesson to be learned here is it's generally not a good idea to publish the code to a vulnerable, in-production system unless you are very, very sure there are no bugs.

[+] betterunix|12 years ago|reply

I like to use this as an example of a problem that secure multiparty computation can solve i.e. that you can remove a buggy / malicious central dealer from a system.

[+] himal|12 years ago|reply

[+] Kartificial|12 years ago|reply

Did this exploit actually get exploited? Or did they notify the site and give them an opportunity to fix it before they released their findings?

[+] DerpDerpDerp|12 years ago|reply

This is one of those times you'd really want to use an actual random number generator, rather than a pseudo-random number generator.

[+] joshka|12 years ago|reply

PRNG would be fine if you could ensure that there's no leakage or reversible information.

[+] krexit|12 years ago|reply

Many regulatory licensing authorities of online gambling companies mandate the use of certified Hardware RNGs. Example: Alderney Gaming Commission. Certification includes Monte Carlo testing of the output, amongst other things.

[+] jonbarker|12 years ago|reply

true random based on atmospheric noise: random.org

[+] Ellipsis753|12 years ago|reply

That's a bad idea. You shouldn't be trusting random.org with your random data (what if they get hacked or something?). Also, if it's sent over HTTP then an attacker could listen in on the random data you were being sent (either at your end or at random.org). Ultimately I think you'd do best to use several software methods and 2 hardware methods and just XOR them all together into a single secure source of random numbers. I mean, if you're doing this as a business, the small cost of this is well worth not having to deal with your random source having issues.

[+] TempleOSV2|12 years ago|reply

[deleted]

[+] chops|12 years ago|reply

[deleted]

84 comments