top | item 38913036

annowiki | 2 years ago

This is not really "sieving" per the article, but what prevents me from running another process that periodically queries the data in a cache? For example, running a Celery worker in Python that continually checks the cache for out-of-date entries and updates them? Is there a word for this? Is this a common technique?

leoqa|2 years ago

I don't think it's that simple: to achieve good metrics (latency, cache hit rate) you would need to predict the actual incoming query load, which is quite hard. Letting the query load itself set the cached values is the state of the art.

In some ways, caching can be seen as a prediction problem, and the cache miss rate is the error, since at time T we only have the previous history to go on. Blending load over time is effectively what these various cache algorithms do to avoid overfitting.

aleksiy123|2 years ago

If you have an idea of what you need to cache, or can fit everything into the cache, it's extremely effective.

Though just refreshing out-of-date data in the cache could potentially increase effectiveness too, given that the general assumption of a cache is that what's in it will probably be used again.

I called it a periodically refreshing cache when I wrote one. Not sure if there is a more formal name.
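
For anyone who wants to see the shape of it, here is a minimal sketch of a periodically refreshing cache in plain Python with a background thread (not Celery). The names `PeriodicRefreshCache`, `loader`, and `interval` are made up for illustration, and it assumes the whole key set fits in memory and that reloading every key on each pass is affordable.

```python
import threading
from typing import Any, Callable, Dict, Hashable


class PeriodicRefreshCache:
    """Cache whose existing entries are reloaded by a background thread."""

    def __init__(self, loader: Callable[[Hashable], Any], interval: float = 30.0):
        self._loader = loader      # hypothetical: fetches the current value for a key
        self._interval = interval  # seconds between refresh passes
        self._data: Dict[Hashable, Any] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def get(self, key: Hashable) -> Any:
        with self._lock:
            if key in self._data:
                return self._data[key]
        value = self._loader(key)  # miss: load outside the lock
        with self._lock:
            self._data[key] = value
        return value

    def refresh_all(self) -> None:
        """One refresh pass: reload every key that is currently cached."""
        with self._lock:
            keys = list(self._data)
        for key in keys:
            value = self._loader(key)
            with self._lock:
                self._data[key] = value

    def start(self) -> threading.Thread:
        def loop() -> None:
            # Event.wait returns False on timeout and True once stop() is called.
            while not self._stop.wait(self._interval):
                self.refresh_all()

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        self._stop.set()
```

Readers never block on a refresh here; they just see the old value until the next pass replaces it. Nothing is ever evicted, so this only works when the key set is bounded.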

toast0|2 years ago

You might call that prefetching. That's what Unbound calls it when it returns a nearly expired cached entry to a client and also contacts upstream to update its cache. I remember having a similar option in Squid, but it might have been only in an employer's branch (there were a lot of nice extensions that unfortunately didn't make it upstream).
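
A rough sketch of that prefetch behavior, in Python rather than anything resembling Unbound's actual code: serve the cached value, and if it is close to expiry, kick off a background reload. `PrefetchingCache`, `loader`, `ttl`, and `prefetch_window` are hypothetical names, and the window size is an arbitrary assumption.

```python
import threading
import time


class PrefetchingCache:
    """TTL cache that refreshes nearly expired entries in the background."""

    def __init__(self, loader, ttl=60.0, prefetch_window=6.0, clock=time.monotonic):
        self._loader = loader                    # hypothetical upstream fetch
        self._ttl = ttl                          # lifetime of an entry, in seconds
        self._prefetch_window = prefetch_window  # refresh when this little TTL remains
        self._clock = clock
        self._entries = {}                       # key -> (value, expires_at)
        self._refreshing = set()                 # keys with a reload in flight
        self._lock = threading.Lock()

    def _load(self, key):
        try:
            value = self._loader(key)
            with self._lock:
                self._entries[key] = (value, self._clock() + self._ttl)
            return value
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def get(self, key):
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            fresh = entry is not None and now < entry[1]
            prefetch = (
                fresh
                and entry[1] - now <= self._prefetch_window
                and key not in self._refreshing
            )
            if prefetch:
                self._refreshing.add(key)
        if not fresh:
            return self._load(key)  # miss or expired: caller waits for upstream
        if prefetch:
            # Nearly expired: answer from cache now, reload in the background.
            threading.Thread(target=self._load, args=(key,), daemon=True).start()
        return entry[0]
```

Unlike a timer-driven refresh, only keys that are actually being requested get reloaded, so the incoming query load still decides what stays hot.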

ImPostingOnHN|2 years ago

You're describing cache maintenance (and cache eviction), a practice for which there are many algorithms (FIFO, LRU, LFU, etc.), including the algorithm the article describes (SIEVE).
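
For reference, eviction in this sense means deciding what to throw out when the cache is full. A minimal LRU sketch (LRU, not SIEVE) using Python's `collections.OrderedDict`; the class name `LRUCache` and its `capacity` parameter are illustrative only.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._data = OrderedDict()  # oldest access first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # a hit makes the key most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # drop the least recently used key
```

Note that nothing here ever updates a stale value; it only chooses which entry to drop.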

aleksiy123|2 years ago

I think this is orthogonal to cache maintenance and cache eviction. Instead, this is having a background process periodically refresh the data in the cache to keep it hot.