Polya Urn Simulation

50 points | cmoog | 2 years ago | observablehq.com

19 comments

bjornsing|2 years ago

I must admit: this went against my intuition. My first guess was that you would end up with an urn full of either red or blue balls.

OscarCunningham|2 years ago

Bear in mind the default behaviour if there were just two balls and you never added any more. Then the proportion of red picks vs blue picks would tend to 1/2. So there's naturally a tendency for the proportion to concentrate in the middle.

As you say, the way in which new balls are added tends to push the proportion towards the extremes.

The uniform distribution is the result of these two tendencies exactly cancelling out.
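A quick way to check the uniform claim numerically is a minimal sketch like the one below. It assumes the setup from the linked simulation as described in this thread (start with one red and one blue ball, and each pick adds one more ball of the drawn color); the function names, trial counts, and bin count are illustrative choices, not taken from the original notebook.

```python
import random


def final_red_fraction(n_picks, rng, red=1, blue=1):
    """Run one urn: draw a ball at random, put it back with one more of its color."""
    for _ in range(n_picks):
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red / (red + blue)


def histogram(n_trials=20000, n_picks=200, n_bins=10, seed=0):
    """Share of trials whose final red fraction lands in each equal-width bin."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_trials):
        frac = final_red_fraction(n_picks, rng)
        # Clamp so a fraction of exactly 1.0 would still fall in the last bin.
        counts[min(int(frac * n_bins), n_bins - 1)] += 1
    return [c / n_trials for c in counts]


if __name__ == "__main__":
    n_bins = 10
    for i, share in enumerate(histogram(n_bins=n_bins)):
        print(f"{i / n_bins:.1f}-{(i + 1) / n_bins:.1f}: {share:.3f}")
```

If the two tendencies really cancel, every bin should hold roughly the same share of trials, with no pile-up at the middle or at the extremes.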

cmoog|2 years ago

For me as well. And when my stochastic probability professor posed this question to the class by a show of hands, the vote was nearly unanimous in favor of the 0/100% end behavior.

xeyownt|2 years ago

Yeah, I don't know what my intuition was.

But the problem is symmetric, and even if you pick a red, you end up with two reds and one blue, which is not that imbalanced.

And even if the mix becomes really imbalanced, say 7 red and one blue, picking the rarest color will have more effect than picking the most common one. So you could consider that the system tries to balance itself naturally, hence avoiding huge swings in some direction or the other.

foobarbecue|2 years ago

Me too, but only because I was expecting something interesting to happen since it was on HN!

theK|2 years ago

Pretty sure the variables the author picked are not the most interesting ones.

Urn models are engineered to have a rich-get-richer bias, which is best seen by varying the initial populations.

Instead of offering trial count and pick count (which are invariants in the actual model), he could have offered the initial ball count and the initial white/red ratio.

cmoog|2 years ago

Ah, good idea! I'll add those in a bit and will remark that the answer/proof are specific to the special case where r_0 = b_0 = 1.
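For other starting counts, the standard result for this urn (one extra ball of the drawn color per pick) is that the limiting red fraction follows a Beta(r_0, b_0) distribution, which reduces to the uniform distribution when r_0 = b_0 = 1. The sketch below is one way to compare a simulation against that prediction; the helper names and the default trial and pick counts are hypothetical, chosen only for illustration.

```python
import random
import statistics


def simulate_fractions(red0, blue0, n_picks=500, n_trials=4000, seed=0):
    """Final red fraction of n_trials independent urns started at (red0, blue0)."""
    rng = random.Random(seed)
    fractions = []
    for _trial in range(n_trials):
        red, blue = red0, blue0
        for _pick in range(n_picks):
            if rng.random() < red / (red + blue):
                red += 1
            else:
                blue += 1
        fractions.append(red / (red + blue))
    return fractions


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


if __name__ == "__main__":
    for red0, blue0 in [(1, 1), (5, 1), (10, 10)]:
        fracs = simulate_fractions(red0, blue0)
        mean, var = beta_mean_var(red0, blue0)
        print(
            f"start=({red0},{blue0}) "
            f"simulated mean={statistics.fmean(fracs):.3f} var={statistics.pvariance(fracs):.4f} | "
            f"Beta mean={mean:.3f} var={var:.4f}"
        )
```

A larger starting population, such as (10, 10), should give a much narrower spread around the initial ratio than (1, 1), which is the rich-get-richer effect being anchored by the initial counts.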

planede|2 years ago

The proof seems to concentrate on the marginal distribution as n goes to infinity. But the simulation hints at something more interesting: each sample of the random process seems to converge to a value, where the value itself is U(0,1).

Is it true that a sample of the random process is convergent with probability 1?

theK|2 years ago

> After a large number of picks, what is the behavior of the proportion of red balls in the urn

Isn’t the more enticing question how strong the bias towards the first-picked color is?

fjfaase|2 years ago

The rather boring answer is 2/3. Logic seems to indicate that it will be a linear distribution, where the chance of ending up with only balls of the first-picked color is highest and the chance of only balls of the other color is zero, because it is no longer possible to pick only balls of the other color. It must be linear because it needs to be symmetric and lead to a uniform distribution if added together.

After two picks, you have three cases. If you picked two different colored balls, the distribution should be a uniform distribution again, just like the initial state. The other two distributions should mirror each other, and thus be linear again.

Maybe something interesting happens with three picks. Or maybe you always end up with linear distributions with tilted slopes. In that case it is rather boring.
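The 2/3 figure and the linear shape can be checked with a small simulation. This is a sketch under the same assumed setup (start at one ball of each color, add one ball of the drawn color per pick); by symmetry it simply labels whichever color comes out first as "first". Function names and parameters are illustrative.

```python
import random


def first_color_fraction(n_picks, rng):
    """Final fraction of the color that was drawn on the very first pick."""
    first, other = 1, 1
    # The first pick defines the "first" color, so it gains a ball immediately.
    first += 1
    for _ in range(n_picks - 1):
        if rng.random() < first / (first + other):
            first += 1
        else:
            other += 1
    return first / (first + other)


def summarize(n_trials=4000, n_picks=300, n_bins=5, seed=0):
    """Return the mean final fraction and the share of trials in each bin."""
    rng = random.Random(seed)
    total = 0.0
    counts = [0] * n_bins
    for _ in range(n_trials):
        frac = first_color_fraction(n_picks, rng)
        total += frac
        counts[min(int(frac * n_bins), n_bins - 1)] += 1
    return total / n_trials, [c / n_trials for c in counts]


if __name__ == "__main__":
    mean, shares = summarize()
    print(f"mean fraction of first-picked color: {mean:.3f}")
    print("bin shares, low to high:", [round(s, 3) for s in shares])
```

If the distribution is linear, the bin shares should rise in roughly equal steps from the lowest bin to the highest, and the mean should sit near 2/3.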

kgwgk|2 years ago

The expected terminal fraction is always equal to the current fraction. (Or is it?)
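One way to check this is a one-step calculation, sketched here for the assumed rule that each pick adds one ball of the drawn color. The symbols R_n (red count), T_n (total count) and X_n = R_n / T_n (red fraction) are notation introduced for this sketch, not taken from the linked page.

```latex
\begin{align*}
\mathbb{E}[X_{n+1} \mid R_n, T_n]
  &= X_n \cdot \frac{R_n + 1}{T_n + 1} + (1 - X_n) \cdot \frac{R_n}{T_n + 1} \\
  &= \frac{R_n + X_n}{T_n + 1}
   = \frac{X_n T_n + X_n}{T_n + 1}
   = X_n .
\end{align*}
```

So the red fraction is a martingale, and since it is bounded between 0 and 1, the martingale convergence theorem says each run converges with probability 1, with the expected limit equal to the current fraction. Starting from one ball of each color and picking a red first gives X = 2/3, which matches the 2/3 quoted above.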