top | item 43357301

(no title)

pash | 11 months ago

The general version of this is called inverse transform sampling [0], which uses the fact that for the cdf F of any random variable X the random variable Y = F(X) has a standard uniform distribution [1]. Since every cdf increases monotonically on the unit interval, every cdf is invertible [2]. So apply the inverse cdf to both sides of the previous equation and you get F^-1(Y) = X is distributed like X.

Sampling from a standard uniform distribution and then using the inverse transform is the commonest way of generating random numbers from an arbitrary distribution.

0. https://en.m.wikipedia.org/wiki/Inverse_transform_sampling

1. https://en.m.wikipedia.org/wiki/Probability_integral_transfo...

2. Not every cdf is one-to-one, however, so you may need a generalized inverse.

discuss

order

tmoertel|11 months ago

For the particular case of the exponential distribution we can go further. By taking advantage of the theory of Poisson processes, we can take samples using a parallel algorithm. It even has a surprisingly succinct SQL translation:

    SELECT *
    FROM Population
    WHERE weight > 0
    ORDER BY -LN(1.0 - RANDOM()) / weight
    LIMIT 100  -- Sample size.
Notice our exponentially distributed random variable on prominent display in the ORDER BY clause.

If you're curious, I explore this algorithm and the theory behind it in https://blog.moertel.com/posts/2024-08-23-sampling-with-sql....

lkuty|11 months ago

Quite off-topic, but do you know when you'll write the article about CPS, if ever?