top | item 9023256

seniorghost | 11 years ago

Honest question: what's the use of compressing a counter to fewer than log(n) bits? Even if you were counting something on the order of 10^30, log n is only around 100 bits. Wouldn't storing the algorithm itself take more space than you'd save on the counter?

maffydub | 11 years ago

I think it depends on how many counters you want.

For example, I might want an approximate count of the number of reads of each row of a cache. Assume my cache has a lot (millions?) of rows, each of which stores a small amount of data. With the given algorithm, I can store each counter in a single byte per cache row. Without it, I'm probably looking at 4 or 8 bytes each. It adds up.
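To make the one-byte-per-row idea concrete, here's a minimal sketch of a Morris-style approximate counter (my own illustration of the algorithm being discussed; class and method names are mine). The stored value c stays near log2(n), so a single byte can represent astronomically large counts:

```python
import random

class MorrisCounter:
    def __init__(self):
        self.c = 0  # stored exponent; fits in one byte per cache row

    def increment(self):
        # Bump the stored value with probability 2**-c, so c tracks log2(n).
        if random.random() < 2.0 ** -self.c:
            self.c += 1

    def estimate(self):
        # 2**c - 1 is an unbiased estimate of the true count n.
        return 2 ** self.c - 1

counter = MorrisCounter()
for _ in range(100_000):
    counter.increment()
# counter.c is now around 17 (log2 of 100,000), not 100,000
```

The trade-off is variance: any single run's estimate can be off by a factor of a few, which is what motivates the multi-counter/median variant mentioned below.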

As an aside, the article presents an alternative in which you have multiple simultaneous counters and take the median of the results to get finer bounds. I wonder how this compares with just using a single counter containing more bits and modifying the probability that you "increment" the counter?

[Edit: I realize the answer to my question about modifying the probability of a single counter may be that it's hard to get an unbiased random source for probabilities of 1/3, 1/5, etc.]
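On the single-counter-with-more-bits question: the standard generalization (not from the article; this is the textbook variant of Morris's counter) replaces base 2 with a tunable base b, incrementing with probability b^-c. With a floating-point PRNG, awkward probabilities like 1/3 or 1/5 aren't actually a problem. A hedged sketch:

```python
import random

class BaseBCounter:
    # Base b close to 1 lowers variance at the cost of a larger stored
    # exponent c -- similar in spirit to averaging several base-2 counters.
    def __init__(self, b=1.1):
        self.b = b
        self.c = 0

    def increment(self):
        # random.random() handles non-dyadic probabilities like b**-c fine.
        if random.random() < self.b ** -self.c:
            self.c += 1

    def estimate(self):
        # (b**c - 1) / (b - 1) is an unbiased estimate of the true count.
        return (self.b ** self.c - 1) / (self.b - 1)

counter2 = BaseBCounter(b=1.1)
for _ in range(100_000):
    counter2.increment()
# with b=1.1 the relative error is roughly sqrt((b-1)/2) ~ 22% per run
```

So a single wider counter with b close to 1 and the median-of-several-counters trick are two routes to the same accuracy/space trade-off.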

xxxyy | 11 years ago

The algorithm is reusable, i.e. you can count multiple events with the same algorithm. That way you get an array of bytes, or even half-bytes, instead of an array of very long integers. This can be useful, especially on embedded systems.
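A sketch of that array-of-bytes layout (sizes and names are my own illustration): one bytearray holds an independent approximate counter per event, saturating at 255 so a byte never overflows.

```python
import random

NUM_EVENTS = 1_000_000
counters = bytearray(NUM_EVENTS)  # 1 MB total, vs 8 MB for 64-bit counts

def record(event_id):
    # Increment event_id's single-byte counter with probability 2**-c,
    # saturating at 255 (which already represents a count near 2**255).
    c = counters[event_id]
    if c < 255 and random.random() < 2.0 ** -c:
        counters[event_id] = c + 1

def approx_count(event_id):
    return 2 ** counters[event_id] - 1

for _ in range(10_000):
    record(42)
# counters[42] ends up around 13 (log2 of 10,000); untouched slots stay 0
```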