Paradoxes of Probability and Other Statistical Strangeness

[+] boreas|8 years ago|reply

For those who might be interested, and in a slightly different vein than the examples in the article, there's the "sleeping beauty" paradox: https://en.wikipedia.org/wiki/Sleeping_Beauty_problem

Basically, an agent is put to sleep and told they will be woken up once or twice, depending on the results of a fair coin flip, without the ability to remember other awakenings.

What probability does the agent assign to the event that the coin landed heads?

The intuitive response is 1/3, but this poses obvious epistemological problems. The agent has, ostensibly, no new information at all, and their prior is surely 1/2. Hope someone else finds this as interesting as I do!

[+] colonelxc|8 years ago|reply

I mostly find it interesting in that people could think that the chance is 1/3 (and that it may even be obvious!). After reading the description I can understand what they are getting at, but I think the conditional probability is messed up.

Instead of P(Monday | Heads) = P(Monday | Tails) = P(Tuesday | Tails) it is really P(Monday | Heads&Awake) = P(Monday | Tails&Awake) = P(Tuesday | Tails&Awake) or something like that. But the interviewer isn't asking about that; they are asking for the probability of the coin. The 3 positions are only exhaustive given that you are awake to be interviewed about them, not exhaustive of possible states (it's missing P(Tuesday | Heads&Asleep)). Since you're always awakened at least once, I find the argument that being awake has 'given you information that it is not Tuesday AND heads' pretty weak. While true, under both heads and tails you expect to be awoken while it is not both Tuesday AND heads.

[+] bo1024|8 years ago|reply

It's very interesting and I don't think there's an obvious correct answer. It's hard to formally model mathematically.

Here's a game-theoretic perspective. In general, when an event has a 1/3 chance of happening, an idealized gambler would be indifferent between the following two bets or lottery tickets: (A) win $2 if the event happens; (B) win $1 if the event doesn't happen. (Notice her average payoff is 2/3 no matter which bet she takes.)

Now in the sleeping beauty problem where tails is two awakenings and heads is one, a gambler would be indifferent between (A) winning $2 every time she wakes up and the coin is heads, and (B) winning $1 every time she wakes up and the coin is tails. This suggests that her "belief" is 1/3.

Another way to put it might be that for a risk-neutral agent, doubling the payoff in one state of the world is equivalent to doubling its "perceived probability". In the sleeping beauty problem, doubling the payoff is like experiencing everything twice.
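
The two candidate answers can be seen side by side in a small simulation. This is only a sketch: the function name, the trial count, and the seed are my own choices, and counting "per awakening" versus "per experiment" is exactly the modelling decision under dispute, so the code illustrates the two bookkeeping schemes and does not settle which one is "the" belief.

```python
import random

def simulate_sleeping_beauty(trials=100_000, seed=0):
    """Run the experiment many times and tally awakenings by coin result."""
    rng = random.Random(seed)
    heads_runs = 0          # experiments where the coin came up heads
    heads_awakenings = 0    # awakenings that happen in a heads experiment
    tails_awakenings = 0    # awakenings that happen in a tails experiment
    for _ in range(trials):
        if rng.random() < 0.5:   # heads: woken once
            heads_runs += 1
            heads_awakenings += 1
        else:                    # tails: woken twice
            tails_awakenings += 2
    total_awakenings = heads_awakenings + tails_awakenings
    return {
        # fraction of experiments in which the coin was heads (about 1/2)
        "heads_per_experiment": heads_runs / trials,
        # fraction of awakenings that occur under heads (about 1/3)
        "heads_per_awakening": heads_awakenings / total_awakenings,
        # average winnings per experiment for bet (A): $2 per heads awakening
        "bet_a_per_experiment": 2 * heads_awakenings / trials,
        # average winnings per experiment for bet (B): $1 per tails awakening
        "bet_b_per_experiment": 1 * tails_awakenings / trials,
    }

results = simulate_sleeping_beauty()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The two bets come out with roughly the same average winnings, which is the indifference described above.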

[+] Houshalter|8 years ago|reply

By far the most unintuitive paradox for me personally is the one presented here: https://youtu.be/go3xtDdsNQM?t=3m27s

"Mr. Jones has 2 children. What is the probability he has a girl if he has a boy born on Tuesday?" Somehow knowing the day of the week the boy was born changes the result. It's completely bizarre.

[+] justinpombrio|8 years ago|reply

The question is ill-posed: it does not give you enough information to tell the probability. You know what Mr. Jones has told you, but you don't know under what circumstances he would have told you this.

Suppose that you ask Mr. Jones whether he has a boy and he says yes. Then the probability that he also has a girl is 2/3.

Suppose that you asked Mr. Jones whether he had a boy born on a Tuesday, and he says yes. Then the probability that he has a girl is less than 2/3, because having two boys gives (about) double the chance for one of them to have been born on a Tuesday.

However, suppose that you asked Mr. Jones whether he has a boy, and if so what day his eldest boy was born on, and he says "yes, and on Tuesday". Then the probability that he also has a girl is again exactly 2/3.

Wikipedia has a detailed explanation: https://en.wikipedia.org/wiki/Boy_or_Girl_paradox
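
The three scenarios can be checked by brute-force counting. This is a sketch under the usual idealised assumptions (sex and weekday independent and uniform, the first kid in each pair taken to be the elder); the helper names and the choice of which day index stands for Tuesday are mine.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # arbitrary label for one of the 7 weekdays

# A kid is (sex, weekday); a family is (elder kid, younger kid).
KIDS = list(product("BG", range(7)))
FAMILIES = list(product(KIDS, repeat=2))  # 196 equally likely families

def p_girl(condition):
    """P(at least one girl | condition), by counting families."""
    kept = [f for f in FAMILIES if condition(f)]
    with_girl = [f for f in kept if any(sex == "G" for sex, _ in f)]
    return Fraction(len(with_girl), len(kept))

def has_boy(family):
    return any(sex == "B" for sex, _ in family)

def has_tuesday_boy(family):
    return any(kid == ("B", TUESDAY) for kid in family)

def eldest_boy_born_tuesday(family):
    boys = [kid for kid in family if kid[0] == "B"]
    return bool(boys) and boys[0][1] == TUESDAY

print(p_girl(has_boy))                  # 2/3
print(p_girl(has_tuesday_boy))          # 14/27
print(p_girl(eldest_boy_born_tuesday))  # 2/3
```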

[+] Chinjut|8 years ago|reply

Your problem is that you are thinking there's a "the boy". But there's not a "the boy". Mr. Jones could have two boys. He could have two boys both born on Tuesday, even. The term "the boy" does not denote any particular boy, in that case, and causes you to think about the situation erroneously.

If the question were "There's Kid 1 and Kid 2, each independently selected with random gender and birth-day-of-the-week. Out of those cases where Kid 1 is a boy born on Tuesday, what proportion are cases where Kid 2 is a girl?", then the answer would indeed be a straightforward 50%; the status of Kid 1 is entirely independent of the status of Kid 2.

But that's not the question. The question is "There's Kid 1 and Kid 2, each independently selected with random gender and birth-day-of-the-week. Out of those cases where at least one (either one, and possibly both) of Kid 1 and Kid 2 is a boy born on Tuesday, what proportion are cases where at least one of Kid 1 and Kid 2 is a girl?".

This is very different, and of course just drawing out the possibilities (all 2 * 7 * 2 * 7 equiprobable-by-stipulation choices of gender and birth-day-of-the-week for Kid 1 and Kid 2) and circling which pairs of subsets are the relevant ones for the two questions reveals the difference, the probabilities for either question elementarily calculable in this way by basic counting.

[+] Jenya_|8 years ago|reply

The comments to this video actually say (with proof) that this was an error in the video.

[+] astrocat|8 years ago|reply

I'm an idiot, but I'm going to throw my hat in the ring here:

The video is wrong. The problem reads: Jones has 2 kids. What is P(he has a girl) given that he has a boy born on a Tuesday? Consider, for a moment, what information we're getting from "boy born on a Tuesday." This is no different than "boy with red hair," or "boy with 5 freckles." The fact that the BOY was born on a Tuesday does not change P(day of the week girl was born). Imagine the "boy with 5 freckles" case - let 5 freckles be denoted by F5, six freckles by F6 and so on... would the appropriate calculation include enumerating P(boy F5, boy Fn) for all n? No.

The "born on Tuesday" is irrelevant. Thus you have the following scenarios:
- one kid is TuesdayBoy and the other is also a boy, born at any time
- one kid is TuesdayBoy and the other is a girl, born at any time

Out of these options P(Jones has a girl) is a flat out 50%. There is no need to bring in concepts of "which was born first" or enumerate all possible days of the week each child could have been born.

Ok... now all the real smartypants here can correct me :)

[+] Retric|8 years ago|reply

It's fairly simple.

If you flip 2 coins and then announce whatever the first coin was, then if you said H the options are HH or HT, and if you said T they are TH or TT. However, if you flip two coins and then say whether you got at least one head, independently of which coin showed it, then on a "yes" you have 3 options, HT, HH, and TH, with equal odds.

So the question is whether the full statement was chosen based on the data, or only the truth value of the statement was.

PS: Now assuming its truth value is based on the data: if you look at all options, there are 14 gender-day combinations per kid and 14 * 14 = 196 gender-day combinations in total. Only 14 of those 196 start with BT, and they split evenly: 7 BTB_, 7 BTG_. That leaves 196 - 14 other options to consider. 7 * 14 of them start with G, and 6 * 14 of them start with B not on a Tuesday, but out of those you only keep 1/14, as you need BT on the second roll. That gives 7 more with a girl and 6 more with two boys. Now add them up: 13 B and 14 G out of (13 + 14) = 27, or 13/27 B and 14/27 G.

[+] georgewsinger|8 years ago|reply

Another "paradox": even though it's possible to randomly pick a rational number from the reals, the probability of this happening is 0.

[+] cdavid|8 years ago|reply

I find that result fairly intuitive once you understand how measure theory came to be.

A much more surprising result is that almost all irrationals are normal numbers, yet we know almost no explicit normal numbers (morally speaking, a normal number is an irrational number whose digits are equiprobable in every base).

[+] BeetleB|8 years ago|reply

This is virtually an axiom for continuous distributions.

One of the axioms of probability is that the probability of a countable union of disjoint events (i.e. sets) is the sum of the probabilities of the individual events.

Assume a uniform distribution between 0 and 1. Now consider point sets of the rationals (i.e. the number 0.5 is represented by a set with just 0.5 in it). Since the distribution is uniform, each set has the same probability (i.e. the likelihood of picking a random rational).

Now consider this question: What is the probability of picking any rational between 0 and 1? Well, that's just the sum of the probabilities over all rationals (because it is a countable sum of disjoint sets). If the probability of picking any particular rational was non-zero, this sum would be infinite, which violates the laws of probability.

Thus, by convention, it's just simpler to define it to be 0.

There's no magic here. These properties were picked merely to make analysis with measure theory clean. Don't try to ascribe any real world meaning to picking a point.
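
Written out, the argument above looks roughly like this. It is a sketch: the enumeration q_1, q_2, ... of the rationals in [0, 1] and the symbol p for the common point probability are notation introduced here, not part of the original comment.

```latex
% Enumerate the rationals in [0,1] as q_1, q_2, \dots and let p = P(\{q_n\}),
% the same for every n because the distribution is uniform.
\[
P\bigl(\mathbb{Q}\cap[0,1]\bigr)
  = \sum_{n=1}^{\infty} P(\{q_n\})
  = \sum_{n=1}^{\infty} p
  = \begin{cases}
      0      & \text{if } p = 0,\\
      \infty & \text{if } p > 0.
    \end{cases}
\]
% A probability cannot exceed 1, so p = 0 and hence P(\mathbb{Q}\cap[0,1]) = 0.
```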

[+] philipov|8 years ago|reply

I think this only sounds like a paradox if it is phrased poorly. The accurate way to state it is "The probability of randomly picking a specific number is 0" and that sounds reasonable. The probability of successfully picking any number is 1.

[+] Sinergy2|8 years ago|reply

Please describe how it is possible to pick such a number. For example, I can readily imagine how to pick a random 32b float, but that is an entirely different problem with a nonzero probability.

[+] ramanan|8 years ago|reply

There are a number of similar paradoxes that arise when considering infinities of different sizes!

Infinity Paradoxes - Numberphile - https://www.youtube.com/watch?v=dDl7g_2x74Q

[+] btilly|8 years ago|reply

There is no uniform probability distribution on the reals.

Perhaps you meant the interval from 0 to 1?

[+] fitchjo|8 years ago|reply

My favorite statistical/probability paradox has always been the birthday paradox.

[+] beefield|8 years ago|reply

I don't know if the Monty Hall problem counts as a paradox, but it is quite high on my list of favourite counterintuitive probability results.

[+] curiousgal|8 years ago|reply

There is roughly a 99% chance that at least two of the users who upvoted this post (57 at this moment) share a birthday.
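
A figure like this can be checked with a few lines of code. This is a minimal sketch assuming 365 equally likely birthdays, no leap years, and independent users; the function name is hypothetical.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # the (k+1)-th person must avoid the k birthdays already taken
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

for n in (23, 57):
    print(n, f"{p_shared_birthday(n):.4f}")
```

With these assumptions the value for 23 people is just over one half, and for 57 people it is about 0.99.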

[+] noam87|8 years ago|reply

For me it's Simpson's paradox: it throws everyone off -- it's caused (and will continue to cause) real-world damage, it's everywhere once you see it -- it's in how newspapers report science, it's in our social policy and how we talk about social issues, it's in court cases --, and finally, it's really hard to explain to a non-math person; so even when it's happening, you sound like the irrational one for pointing it out.

... and don't get cocky once you know about it, because it's so pernicious it'll get you too if you're not careful!
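
For anyone who has not seen the reversal in numbers, here is a toy illustration. The counts are entirely made up for the example (two hypothetical treatments, A and B, and two hypothetical severity groups); they are not from any study.

```python
# Made-up counts for illustration only: (successes, patients).
data = {
    "A": {"mild": (19, 20), "severe": (40, 80)},
    "B": {"mild": (72, 80), "severe": (8, 20)},
}

def group_rate(treatment, group):
    successes, patients = data[treatment][group]
    return successes / patients

def overall_rate(treatment):
    successes = sum(s for s, _ in data[treatment].values())
    patients = sum(n for _, n in data[treatment].values())
    return successes / patients

for group in ("mild", "severe"):
    print(group, group_rate("A", group), group_rate("B", group))
print("overall", overall_rate("A"), overall_rate("B"))
```

A does better within each group, yet B does better overall, because A was given mostly to the severe cases and B mostly to the mild ones.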

[+] pella|8 years ago|reply

https://en.wikipedia.org/wiki/Category:Statistical_paradoxes

[+] daxfohl|8 years ago|reply

"Paradox" is a pretty strong term. The items presented are more in the category of common errors and counter-intuitiveness.

[+] Houshalter|8 years ago|reply

https://en.wikipedia.org/wiki/Veridical_paradox

88 comments