top | item 42977560

(no title)

jsmcgd | 1 year ago

Instead of discounting some of the results, can we alternate the value of the coin each toss? So on the first flip, heads is 0 and tails is 1, then on the next flip, heads is 1 and tails is 0.

discuss

hansvm|1 year ago

That'll obtain the right average but won't have the same pairwise relations as an independent, unbiased coin.

That's probably easiest to see if you imagine approaching an infinitely biased coin (100% heads, 0% tails). Your strategy alternates between 0 and 1 almost always. The listed strategy throws away most flips but gives actualy unbiased results when a pair does pass.

Another way to look at it is from an entropy perspective. An unbiased, independent coin flip has 1 bit of entropy. A biased coin with, e.g., 99% heads has 0.0807 bits of entropy. On average, you need at least 12.377 such flips to emulate an unbiased, independent flip. Any strategy without some sort of rejection/continuation/... (like your proposal) is doomed to fail.

I haven't checked if their proposal is actually optimal. Empirically, it's suggestive of having room for improvement. I'm seeing something like 101 flips on average instead of 12.377 for that 99% bias example.

hatthew|1 year ago

A general direction you could go is to use blocks greater than 2. For example, you flip the coin exactly 64 times, and discard the result unless there is exactly 1 tails and 63 heads. This happens about 22% of the time, so it's on average 290 flips to get a single sample. When you do get that sample, you convert the single tails' position within that 64 block into binary, and get a 6 bit number uniformly distributed 0-63, i.e. you get 6 bits of entropy. So on average 6/290=0.02 bits of entropy per flip, twice as good as using blocks of 2, though only a quarter as good as the theoretical upper bound.

I picked "block of 64 with only a single tails" since it was simple, and I'm sure a mathematician could figure out how to optimize it much more, but my general point is to agree that there's definitely ways to get closer to the theoretical upper bound you mentioned.