My favorite prime number generator

[+] tromp|2 years ago|reply

My favorite prime number generator is

    ┬─┬───────────────────────────────────┬──── ────┬
    │ │ ┬─┬ ────┬─┬────────────────────── ┼───┬ ┬───┼
    │ │ └─┤ ────┼─┼───────────────────┬── ┼───┼ │ ┬ │
    │ │   │ ┬───┼─┼───────────────────┼── ┼─┬─┼ │ ┼ │
    │ │   │ │ ─ ┼─┼─┬─────┬──── ──────┼─┬ │ ├─┘ └─┤ │
    │ │   │ │ ┬ └─┤ │ ┬─┬ ┼─┬─┬ ──┬───┼─┼ ├─┘     ├─┘
    │ │   │ └─┤   └─┤ └─┤ │ ├─┘ ──┼─┬─┼─┼ │       │
    │ │   │   │     │   │ ├─┘   ┬─┼─┼─┼─┼ │       │
    │ │   │   │     │   ├─┘     └─┤ │ ├─┘ │       │
    │ │   │   │     └───┤         │ ├─┘   │       │
    │ │   │   │         │         ├─┘     │       │
    │ │   │   │         ├─────────┘       │       │
    │ │   │   ├─────────┘                 │       │
    │ │   └───┤                           │       │
    │ │       ├───────────────────────────┘       │
    │ ├───────┘                                   │
    └─┤                                           │
      └───────────────────────────────────────────┘

which graphically depicts the 167-bit lambda term

    λ 1 (1 ((λ 1 1) (λ λ λ 1 (λ λ 1) ((λ 4 4 1 ((λ 1 1) (λ 2 (1 1)))) (λ λ λ λ 1 3 (2 (6 4))))) (λ λ λ 4 (1 3)))) (λ λ 1 (λ λ 2) 2)

which generates the characteristic sequence of primes [1] [2].

[1] https://tromp.github.io/cl/cl.html

[2] https://github.com/tromp/AIT/blob/master/characteristic_sequ...

[+] cobaltoxide|2 years ago|reply

Could you explain it? Those two links are Greek to me.

[+] pixelpoet|2 years ago|reply

Super cool to see one of the authors of the most popular Go scoring methods (particularly for MCTS) posting here :) Respect and thanks!

[+] hgsgm|2 years ago|reply

What's the "characteristic sequence"?

[+] MrJeffUtah|2 years ago|reply

[deleted]

[+] svat|2 years ago|reply

This is really elegant! To make the explanation more concrete, here's a snapshot of the algorithm's state (the dict D) after it has processed numbers up to 10 (inclusive):

    12: [3, 2]
    25: [5]
    49: [7]

So each prime up to 10 is present exactly once. When we process 12, we move 2 to the list for 14, and 3 to the list for 15.

The Sorenson paper linked in the post is also a great read, I remember reading it a couple of years ago and it is very clear. It adds many further optimizations on top of standard sieve of Eratosthenes (e.g. wheel factorization is a generalization of the idea of considering only numbers in {1, 5} mod 6, or {1, 7, 11, 13, 17, 19, 23, 29} mod 30, or …), and segmenting is what you need to do to avoid running out of memory.

The fastest such generator (using sieve-based methods) seems to be https://github.com/kimwalisch/primesieve — it implements all those optimizations and more.

However, if you just want the nth prime, it's even faster to compute π(n) by combinatorial/analytic methods, instead of a sieve: the same author has https://github.com/kimwalisch/primecount for that. (Try the ten-trillionth prime number.)

[+] tzs|2 years ago|reply

My favorite is this FRACTRAN program: (17/91, 78/85, 19/51, 23/38, 29/33, 77/29, 1/17, 11/13, 13/11, 15/2, 1/7, 55/1) with input 2.

It doesn't quite produce primes directly. It gives a sequence of integers, some of which are powers of 2. Those powers of 2 are 2^2, 2^3, 2^5, 2^7, 2^11, ..., i.e., the sequence of 2 to the power of primes.

For those who have never seen FRACTRAN it is an esoteric programming language invented by John Conway. It is Turing-complete.

Here's how you execute a FRACTRAN program, with input N.

1. Step through the list of fractions in order until you find a fraction f such that fN is an integer or you run out of fractions.

2. If you run out of fractions halt.

3. Output fN.

4. Replace N with fN.

5. Goto step 1.

Wikipedia gives some sample programs and explains in detail how the heck FRACTRAN can compute: https://en.wikipedia.org/wiki/FRACTRAN

[+] SeanLuke|2 years ago|reply

Believe it or not, there exists a closed-form equation which computes the nth prime number. It's Willans's formula:

p_n = 1 + Sum_{i=1}^{2^n} Floor((n/(Sum_{j=1}^{i} Floor((Cos((j-1)!+1)/j) Pi)^2)))^(1/n))

https://en.wikipedia.org/wiki/Formula_for_primes

https://www.youtube.com/watch?v=j5s0h42GfvM

It's not fast but it really does work.

[+] svat|2 years ago|reply

That video is actually great!

The last time this formula came up, I added some references to the Wikipedia article (current version: https://en.wikipedia.org/w/index.php?title=Formula_for_prime... about the "worthlessness" of such formulas, but it doesn't deter people from being interested in that formula… as long as they don't imagine it's useful in any way, it's fine to enjoy its perverse cleverness, I suppose.

The video does well to quote Wilf on “the disease was preferable to the cure” and to end with “Could it [C. P. Willans] be a pseudonym for someone who didn't want to tarnish their reputation by writing about useless prime-generating formulas?” :-) Dudley (who mentions a large list of other such formulas) also has a nice conclusion:

> The conclusion to be drawn from all this, I think, is that formulas for formulas' sake do not advance the mathematical enterprise. Formulas should be useful. If not, they should be astounding, elegant, enlightening, simple, or have some other redeeming value.

[+] scythe|2 years ago|reply

The memory usage can in fact be optimized further with a small change:

https://gist.github.com/scythe/5fb962722934c58c60430180beab8...

In the spirit of the blog post, I'll let you guess how it works :p

[+] svat|2 years ago|reply

Wow this is a great optimization! You're under-selling it :) It changes the memory requirement from O(π(n)) to O(π(√n)), e.g. after processing numbers up to 100, the internal state in the OP's algorithm is:

    q = 101
    D = {'102': [3, 2], '105': [7, 5], '121': [11], '169': [13], '289': [17], '361': [19], '529': [23], '841': [29], '961': [31], '1369': [37], '1681': [41], '1849': [43], '2209': [47], '2809': [53], '3481': [59], '3721': [61], '4489': [67], '5041': [71], '5329': [73], '6241': [79], '6889': [83], '7921': [89], '9409': [97]}

while with your "small change", the state is just:

    q = 101
    S = 121
    r = 11
    D = {'102': [3, 2], '105': [7, 5]}

[+] tantalor|2 years ago|reply

My favorite prime number generator is http://alpha61.com/primenumbershittingbear/

[+] proactivesvcs|2 years ago|reply

It doesn't need Flash Player any more? It's the year 2023 and I can run the Prime Number Shitting Bear again? Hallelujah!

[+] omoikane|2 years ago|reply

My favorite prime number generator is the undocumented __next_prime():

https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b501...

There is no good reason to use this one except in a code golf environment that includes all headers by default, which is where I learned about it.

[+] ChrisCinelli|2 years ago|reply

The Sieve of Atkin is VERY fast at generating primes between 1 and n.

It asymptotically speeds up the process of generating prime numbers. It has time complexity O(n/(log log n))

https://www.baeldung.com/cs/prime-number-algorithms

[+] anonymoushn|2 years ago|reply

Your link seems to say it has time complexity O(n)?

[+] max_|2 years ago|reply

In J generating prime numbers is simply p:21, that gives you 79 which is the 21st prime number.

It is so fast, I wonder what the algorithm behind it is.

[0]: https://code.jsoftware.com/wiki/Vocabulary/pco

[+] jonahx|2 years ago|reply

> I wonder what the algorithm behind it is.

Noted in the details section of the link you provided:

"Primality testing on numbers larger than (2^31) uses the probabilistic Miller-Rabin algorithm."

[+] noman-land|2 years ago|reply

How high can you go with this?

[+] philipLutz|2 years ago|reply

I enjoyed this, but I am a little stuck on how stuck the author is on yield. If one ends up creating a large list of prime numbers with this generator, why not just return a large list of prime numbers? The author clearly notes how optimizing Python code is "odd" when you can more easily switch languages, but languages like C/C++/Rust would not have pretty code because it does not have yield. It is obvious that returning an array of prime numbers is what one would do in C/C++/Rust.

I understand that not using yield would mean it is not strictly a "generator", but the spirit of this problem is generating the sequence of primes (unless I am missing something).

[+] vlovich123|2 years ago|reply

> but languages like C/C++/Rust would not have pretty code because it does not have yield

Maybe I'm misunderstanding you, but Rust has native generators with yield: https://doc.rust-lang.org/beta/unstable-book/language-featur...

C++ also does using coroutines + co_yield (https://en.cppreference.com/w/cpp/language/coroutines#co_yie...) or iterators (old syntax).

[+] btilly|2 years ago|reply

Try solving some Project Euler problems. You'll see the win pretty quickly when you have to search a large range for primes.

[+] masklinn|2 years ago|reply

> If one ends up creating a large list of prime numbers with this generator

But one does not always end up creating a large list of prime numbers, do they?

> why not just return a large list of prime numbers?

Because that requires upfront knowledge of the filtering factor or absence thereof.

A generator means you don't care, you generate an infinite sequence and the consumer is free to do whatever they want.

> It is obvious that returning an array of prime numbers is what one would do in C/C++/Rust.

It's certainly not obvious for Rust.

[+] jameshart|2 years ago|reply

If the problem is outputting the primes, then storing them in a list is not a necessary part of that problem.

When writing ‘fizz buzz’, would you expect to return an array of 100 strings, some of which are numbers, some ‘fizz’es and some ‘buzz’es?

[+] Vexs|2 years ago|reply

It's a bit of a convolution, but let's say we've got some iterator of unknown size and we want to have a prime each iteration. We can do the following in python, which saves us from having to pre-compute the primes _or_ know how long `some_iterator` is.

primes = gen_primes() for item, prime in zip(some_iterator, primes):

[+] sp332|2 years ago|reply

This implementation runs until it crashes. So you need a way to output the values as you go, or they will be lost when you run out of memory. It also lets you control how many values are generated if you just want to see a few without actually using all the RAM.

[+] nemo8551|2 years ago|reply

Can I just say how surprised I am the many people have a favourite prime number generator.

I can’t say I have a favourite but I do enjoy multiplying two random prime numbers with each other and using the random result when I need a random number with added random.

[+] cobaltoxide|2 years ago|reply

That doesn't seem like a good way to generate "random" numbers, since it will only produce numbers with exactly two prime factors.

[+] andrelaszlo|2 years ago|reply

A way to add even more random is to generate a list of N random primes, then shuffling it M times before multiplying the elements. N and M are of course random positive integers.

To add maximum random, sometimes I like to return a string with a random Gertrude Stein quote, segfault, or make the PC speaker beep instead of returning a number. So random!

[+] naillo|2 years ago|reply

This person has an incredible work ethic

[+] jacknews|2 years ago|reply

Nice, I like the optimization replacing the list, and just hunting for the next 'free' composite for the prime.

Teaching my kids coding, we also arrived at most of the other optimizations, in Janet:

  (defn primes-sieve []
    (fiber/new (fn []
      (yield 2) # the only even prime, yield it explicitly.
  
      (let [sieve (table/new (math/pow 2 12))]
        (loop [n :range [3 nil 2]]  # only odd numbers
          (if-let [current-composite (sieve n)]
            (do 
              (each prime-factor current-composite
                (let [next-composite-key (+ n (* 2 prime-factor))] # add 2*prime-factor, else composite would be even
                  (if-let [next-composite (sieve next-composite-key)]
                    (array/push next-composite prime-factor)
                    (put sieve next-composite-key @[prime-factor]))))
              (put sieve n nil))
            (do
              (yield n)
              (put sieve (* n n) @[n])) # next composite is n^2. Anything less is composite a smaller prime, already in the sieve
            ))))))

[+] tzot|2 years ago|reply

The fastest variant I know of this algorithm is the one found at https://stackoverflow.com/a/3796442/6899 :

    import itertools as it
    def erat3( ):
        D = { 9: 3, 25: 5 }
        yield 2
        yield 3
        yield 5
        MASK= 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0,
        MODULOS= frozenset( (1, 7, 11, 13, 17, 19, 23, 29) )

        for q in it.compress(
                it.islice(it.count(7), 0, None, 2),
                it.cycle(MASK)):
            p = D.pop(q, None)
            if p is None:
                D[q*q] = q
                yield q
            else:
                x = q + 2*p
                while x in D or (x%30) not in MODULOS:
                    x += 2*p
                D[x] = p

[+] svat|2 years ago|reply

This adds two things over the OP's initial version, both mentioned in the post: wheel factorization (with 30), and the "linear probing" to avoid lists. It's possible to keep improving (up to a point) by using e.g. wheel with 210 instead of 30, and so on, if one doesn't mind hard-coding more and more.

[+] simlevesque|2 years ago|reply

isn't hardcoding 2 3 and 5 a bit of a cheat ? It seems like you could hardcode it to yield the X next primes and it'd be faster.

[+] rullelito|2 years ago|reply

My favorite one:

    d = defaultdict(list)
    
    # Magic
    for i in range(2, 10000):
            for y in i in d and d.pop(i) or [i]:
                d[i+y] += (y,)
    
    # Not sure how to print this in a succinct way
    print(sorted(list(set().union(*d.values()))))

[+] paulddraper|2 years ago|reply

Beautiful

  from collections import defaultdict

  d = defaultdict(list)
  for i in range(2, 10000):
      for y in d.pop(i, [i]):
          d[i + y].append(y)
  for y in sorted(y for x in d.values() for y in x):
      print(y)

[+] anaphor|2 years ago|reply

There's a functional / lazy version I like that uses a priority queue in order to get the next prime number https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

It basically uses the priority queue to store all the composites up to a certain point and then you can assume the next one is prime.

I had some (admittedly not well written) code for it too https://gist.github.com/weskerfoot/4699275

It can also incorporates the wheel factorization optimization mentioned in this article.

[+] krnaveen14|2 years ago|reply

Reminds me of the time where I've implemented highly optimised version of Sieve of Eratosthenes in Java, just for a Code Golf Challenge.

Memory Usage is greatly reduced by using BitSet.

https://github.com/krnaveen14/PrimeNumbers/blob/master/src/m...

https://codegolf.stackexchange.com/a/66915/41984

OMG, it's been 8 years already!

[+] shaftoe444|2 years ago|reply

My favourite prime number generator is this one using go channels to create filters. Not efficient but fun.

https://go.dev/play/p/9U22NfrXeq

[+] oxxoxoxooo|2 years ago|reply

And another one:

The Genuine Sieve of Eratosthenes https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

[+] Exuma|2 years ago|reply

He said he has "used this code many times"... have you ever needed to find prime numbers in your daily programming?

83 comments