This is really elegant! To make the explanation more concrete, here's a snapshot of the algorithm's state (the dict D) after it has processed numbers up to 10 (inclusive):
12: [3, 2]
25: [5]
49: [7]
So each prime up to 10 is present exactly once. When we process 12, we move 2 to the list for 14, and 3 to the list for 15.
The Sorenson paper linked in the post is also a great read, I remember reading it a couple of years ago and it is very clear. It adds many further optimizations on top of standard sieve of Eratosthenes (e.g. wheel factorization is a generalization of the idea of considering only numbers in {1, 5} mod 6, or {1, 7, 11, 13, 17, 19, 23, 29} mod 30, or …), and segmenting is what you need to do to avoid running out of memory.
However, if you just want the nth prime, it's even faster to compute π(n) by combinatorial/analytic methods, instead of a sieve: the same author has https://github.com/kimwalisch/primecount for that. (Try the ten-trillionth prime number.)
My favorite is this FRACTRAN program: (17/91, 78/85, 19/51, 23/38, 29/33, 77/29, 1/17, 11/13, 13/11, 15/2, 1/7, 55/1) with input 2.
It doesn't quite produce primes directly. It gives a sequence of integers, some of which are powers of 2. Those powers of 2 are 2^2, 2^3, 2^5, 2^7, 2^11, ..., i.e., the sequence of 2 to the power of primes.
For those who have never seen FRACTRAN it is an esoteric programming language invented by John Conway. It is Turing-complete.
Here's how you execute a FRACTRAN program, with input N.
1. Step through the list of fractions in order until you find a fraction f such that fN is an integer or you run out of fractions.
The last time this formula came up, I added some references to the Wikipedia article (current version: https://en.wikipedia.org/w/index.php?title=Formula_for_prime... about the "worthlessness" of such formulas, but it doesn't deter people from being interested in that formula… as long as they don't imagine it's useful in any way, it's fine to enjoy its perverse cleverness, I suppose.
The video does well to quote Wilf on “the disease was preferable to the cure” and to end with “Could it [C. P. Willans] be a pseudonym for someone who didn't want to tarnish their reputation by writing about useless prime-generating formulas?” :-) Dudley (who mentions a large list of other such formulas) also has a nice conclusion:
> The conclusion to be drawn from all this, I think, is that formulas for formulas' sake do not advance the mathematical enterprise. Formulas should be useful. If not, they should be astounding, elegant, enlightening, simple, or have some other redeeming value.
Wow this is a great optimization! You're under-selling it :) It changes the memory requirement from O(π(n)) to O(π(√n)), e.g. after processing numbers up to 100, the internal state in the OP's algorithm is:
I enjoyed this, but I am a little stuck on how stuck the author is on yield. If one ends up creating a large list of prime numbers with this generator, why not just return a large list of prime numbers? The author clearly notes how optimizing Python code is "odd" when you can more easily switch languages, but languages like C/C++/Rust would not have pretty code because it does not have yield. It is obvious that returning an array of prime numbers is what one would do in C/C++/Rust.
I understand that not using yield would mean it is not strictly a "generator", but the spirit of this problem is generating the sequence of primes (unless I am missing something).
It's a bit of a convolution, but let's say we've got some iterator of unknown size and we want to have a prime each iteration.
We can do the following in python, which saves us from having to pre-compute the primes _or_ know how long `some_iterator` is.
primes = gen_primes()
for item, prime in zip(some_iterator, primes):
This implementation runs until it crashes. So you need a way to output the values as you go, or they will be lost when you run out of memory. It also lets you control how many values are generated if you just want to see a few without actually using all the RAM.
Can I just say how surprised I am the many people have a favourite prime number generator.
I can’t say I have a favourite but I do enjoy multiplying two random prime numbers with each other and using the random result when I need a random number with added random.
A way to add even more random is to generate a list of N random primes, then shuffling it M times before multiplying the elements. N and M are of course random positive integers.
To add maximum random, sometimes I like to return a string with a random Gertrude Stein quote, segfault, or make the PC speaker beep instead of returning a number. So random!
Nice, I like the optimization replacing the list, and just hunting for the next 'free' composite for the prime.
Teaching my kids coding, we also arrived at most of the other optimizations, in Janet:
(defn primes-sieve []
(fiber/new (fn []
(yield 2) # the only even prime, yield it explicitly.
(let [sieve (table/new (math/pow 2 12))]
(loop [n :range [3 nil 2]] # only odd numbers
(if-let [current-composite (sieve n)]
(do
(each prime-factor current-composite
(let [next-composite-key (+ n (* 2 prime-factor))] # add 2*prime-factor, else composite would be even
(if-let [next-composite (sieve next-composite-key)]
(array/push next-composite prime-factor)
(put sieve next-composite-key @[prime-factor]))))
(put sieve n nil))
(do
(yield n)
(put sieve (* n n) @[n])) # next composite is n^2. Anything less is composite a smaller prime, already in the sieve
))))))
import itertools as it
def erat3( ):
D = { 9: 3, 25: 5 }
yield 2
yield 3
yield 5
MASK= 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0,
MODULOS= frozenset( (1, 7, 11, 13, 17, 19, 23, 29) )
for q in it.compress(
it.islice(it.count(7), 0, None, 2),
it.cycle(MASK)):
p = D.pop(q, None)
if p is None:
D[q*q] = q
yield q
else:
x = q + 2*p
while x in D or (x%30) not in MODULOS:
x += 2*p
D[x] = p
This adds two things over the OP's initial version, both mentioned in the post: wheel factorization (with 30), and the "linear probing" to avoid lists. It's possible to keep improving (up to a point) by using e.g. wheel with 210 instead of 30, and so on, if one doesn't mind hard-coding more and more.
d = defaultdict(list)
# Magic
for i in range(2, 10000):
for y in i in d and d.pop(i) or [i]:
d[i+y] += (y,)
# Not sure how to print this in a succinct way
print(sorted(list(set().union(*d.values()))))
from collections import defaultdict
d = defaultdict(list)
for i in range(2, 10000):
for y in d.pop(i, [i]):
d[i + y].append(y)
for y in sorted(y for x in d.values() for y in x):
print(y)
[+] [-] tromp|2 years ago|reply
[1] https://tromp.github.io/cl/cl.html
[2] https://github.com/tromp/AIT/blob/master/characteristic_sequ...
[+] [-] cobaltoxide|2 years ago|reply
[+] [-] pixelpoet|2 years ago|reply
[+] [-] hgsgm|2 years ago|reply
[+] [-] MrJeffUtah|2 years ago|reply
[deleted]
[+] [-] svat|2 years ago|reply
The Sorenson paper linked in the post is also a great read, I remember reading it a couple of years ago and it is very clear. It adds many further optimizations on top of standard sieve of Eratosthenes (e.g. wheel factorization is a generalization of the idea of considering only numbers in {1, 5} mod 6, or {1, 7, 11, 13, 17, 19, 23, 29} mod 30, or …), and segmenting is what you need to do to avoid running out of memory.
The fastest such generator (using sieve-based methods) seems to be https://github.com/kimwalisch/primesieve — it implements all those optimizations and more.
However, if you just want the nth prime, it's even faster to compute π(n) by combinatorial/analytic methods, instead of a sieve: the same author has https://github.com/kimwalisch/primecount for that. (Try the ten-trillionth prime number.)
[+] [-] tzs|2 years ago|reply
It doesn't quite produce primes directly. It gives a sequence of integers, some of which are powers of 2. Those powers of 2 are 2^2, 2^3, 2^5, 2^7, 2^11, ..., i.e., the sequence of 2 to the power of primes.
For those who have never seen FRACTRAN it is an esoteric programming language invented by John Conway. It is Turing-complete.
Here's how you execute a FRACTRAN program, with input N.
1. Step through the list of fractions in order until you find a fraction f such that fN is an integer or you run out of fractions.
2. If you run out of fractions halt.
3. Output fN.
4. Replace N with fN.
5. Goto step 1.
Wikipedia gives some sample programs and explains in detail how the heck FRACTRAN can compute: https://en.wikipedia.org/wiki/FRACTRAN
[+] [-] SeanLuke|2 years ago|reply
p_n = 1 + Sum_{i=1}^{2^n} Floor((n/(Sum_{j=1}^{i} Floor((Cos((j-1)!+1)/j) Pi)^2)))^(1/n))
https://en.wikipedia.org/wiki/Formula_for_primes
https://www.youtube.com/watch?v=j5s0h42GfvM
It's not fast but it really does work.
[+] [-] svat|2 years ago|reply
The last time this formula came up, I added some references to the Wikipedia article (current version: https://en.wikipedia.org/w/index.php?title=Formula_for_prime... about the "worthlessness" of such formulas, but it doesn't deter people from being interested in that formula… as long as they don't imagine it's useful in any way, it's fine to enjoy its perverse cleverness, I suppose.
The video does well to quote Wilf on “the disease was preferable to the cure” and to end with “Could it [C. P. Willans] be a pseudonym for someone who didn't want to tarnish their reputation by writing about useless prime-generating formulas?” :-) Dudley (who mentions a large list of other such formulas) also has a nice conclusion:
> The conclusion to be drawn from all this, I think, is that formulas for formulas' sake do not advance the mathematical enterprise. Formulas should be useful. If not, they should be astounding, elegant, enlightening, simple, or have some other redeeming value.
[+] [-] scythe|2 years ago|reply
https://gist.github.com/scythe/5fb962722934c58c60430180beab8...
In the spirit of the blog post, I'll let you guess how it works :p
[+] [-] svat|2 years ago|reply
[+] [-] tantalor|2 years ago|reply
[+] [-] proactivesvcs|2 years ago|reply
[+] [-] omoikane|2 years ago|reply
https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b501...
There is no good reason to use this one except in a code golf environment that includes all headers by default, which is where I learned about it.
[+] [-] ChrisCinelli|2 years ago|reply
It asymptotically speeds up the process of generating prime numbers. It has time complexity O(n/(log log n))
https://www.baeldung.com/cs/prime-number-algorithms
[+] [-] anonymoushn|2 years ago|reply
[+] [-] max_|2 years ago|reply
It is so fast, I wonder what the algorithm behind it is.
[0]: https://code.jsoftware.com/wiki/Vocabulary/pco
[+] [-] jonahx|2 years ago|reply
Noted in the details section of the link you provided:
"Primality testing on numbers larger than (2^31) uses the probabilistic Miller-Rabin algorithm."
[+] [-] noman-land|2 years ago|reply
[+] [-] philipLutz|2 years ago|reply
I understand that not using yield would mean it is not strictly a "generator", but the spirit of this problem is generating the sequence of primes (unless I am missing something).
[+] [-] vlovich123|2 years ago|reply
Maybe I'm misunderstanding you, but Rust has native generators with yield: https://doc.rust-lang.org/beta/unstable-book/language-featur...
C++ also does using coroutines + co_yield (https://en.cppreference.com/w/cpp/language/coroutines#co_yie...) or iterators (old syntax).
[+] [-] btilly|2 years ago|reply
[+] [-] masklinn|2 years ago|reply
But one does not always end up creating a large list of prime numbers, do they?
> why not just return a large list of prime numbers?
Because that requires upfront knowledge of the filtering factor or absence thereof.
A generator means you don't care, you generate an infinite sequence and the consumer is free to do whatever they want.
> It is obvious that returning an array of prime numbers is what one would do in C/C++/Rust.
It's certainly not obvious for Rust.
[+] [-] jameshart|2 years ago|reply
When writing ‘fizz buzz’, would you expect to return an array of 100 strings, some of which are numbers, some ‘fizz’es and some ‘buzz’es?
[+] [-] Vexs|2 years ago|reply
primes = gen_primes() for item, prime in zip(some_iterator, primes):
[+] [-] sp332|2 years ago|reply
[+] [-] nemo8551|2 years ago|reply
I can’t say I have a favourite but I do enjoy multiplying two random prime numbers with each other and using the random result when I need a random number with added random.
[+] [-] cobaltoxide|2 years ago|reply
[+] [-] andrelaszlo|2 years ago|reply
To add maximum random, sometimes I like to return a string with a random Gertrude Stein quote, segfault, or make the PC speaker beep instead of returning a number. So random!
[+] [-] naillo|2 years ago|reply
[+] [-] jacknews|2 years ago|reply
Teaching my kids coding, we also arrived at most of the other optimizations, in Janet:
[+] [-] tzot|2 years ago|reply
[+] [-] svat|2 years ago|reply
[+] [-] simlevesque|2 years ago|reply
[+] [-] rullelito|2 years ago|reply
[+] [-] paulddraper|2 years ago|reply
[+] [-] anaphor|2 years ago|reply
It basically uses the priority queue to store all the composites up to a certain point and then you can assume the next one is prime.
I had some (admittedly not well written) code for it too https://gist.github.com/weskerfoot/4699275
It can also incorporates the wheel factorization optimization mentioned in this article.
[+] [-] krnaveen14|2 years ago|reply
Memory Usage is greatly reduced by using BitSet.
https://github.com/krnaveen14/PrimeNumbers/blob/master/src/m...
https://codegolf.stackexchange.com/a/66915/41984
OMG, it's been 8 years already!
[+] [-] shaftoe444|2 years ago|reply
https://go.dev/play/p/9U22NfrXeq
[+] [-] oxxoxoxooo|2 years ago|reply
The Genuine Sieve of Eratosthenes https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
[+] [-] Exuma|2 years ago|reply