gregable | 9 months ago
There's also a distributed version, which is easy to do with a map-reduce.
Or the very simple algorithm: generate a random number paired with each item in the stream and keep the top N items ordered by that random number.
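
A minimal Python sketch of that random-key idea, for illustration only. The names `sample_top_n` and `merge_samples` are made up here, and the min-heap is just one way to keep the top N. Because each item keeps its random key, partial samples from separate shards can be merged by taking the top N keys again, which is roughly what the reduce step of a distributed version would do.

```python
import heapq
import random
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")
Entry = Tuple[float, int, T]  # (random key, arrival index, item)


def sample_top_n(stream: Iterable[T], n: int,
                 rng: Optional[random.Random] = None) -> List[Entry]:
    """Keep the n items with the largest random keys (hypothetical helper)."""
    if n <= 0:
        return []
    rng = rng or random.Random()
    heap: List[Entry] = []  # min-heap: heap[0] holds the smallest kept key
    for seq, item in enumerate(stream):
        # seq breaks key ties so the items themselves are never compared
        entry = (rng.random(), seq, item)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            # New key beats the smallest kept key, so swap it in
            heapq.heapreplace(heap, entry)
    return heap


def merge_samples(partials: Iterable[List[Entry]], n: int) -> List[Entry]:
    """Combine per-shard samples by keeping the n largest keys overall."""
    combined = (entry for partial in partials for entry in partial)
    return heapq.nlargest(n, combined, key=lambda entry: entry[0])


if __name__ == "__main__":
    shard_a = sample_top_n(range(0, 500), 5)
    shard_b = sample_top_n(range(500, 1000), 5)
    merged = merge_samples([shard_a, shard_b], 5)
    print([item for _key, _seq, item in merged])
```

Each shard only ever holds N entries, so memory stays at O(N) per worker no matter how long the stream is.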
tmoertel | 9 months ago
I discuss these issues more here: https://blog.moertel.com/posts/2024-08-23-sampling-with-sql....