
1a1a11a | 2 years ago

I agree with you; any non-random algorithm has the problem

kazinator | 2 years ago

Well, any pseudo-random algorithm. If we avoid pseudo-random behaviors, we can provide certain guarantees that no pattern of access can get around, such as: if an item is freshly inserted into the cache, then it will stay there for (at least) the next N cache misses, for some specific N, because those N misses will each evict something else.

If we consider LRU, no contrived access pattern can cause it to evict an item Y while an item Z is present that was less recently touched than Y.
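That LRU property is easy to check on a toy example. Here's a minimal sketch (my own illustration, not anyone's production code) using `OrderedDict`, where the least recently touched key always sits at the front and is the only eviction candidate:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: on a miss at capacity, evict the least
    recently touched key."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # oldest (LRU) first, newest (MRU) last

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # touching a key makes it MRU
            return True                 # hit
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the LRU end, never a newer key
        self.data[key] = None
        return False                    # miss

cache = LRUCache(3)
for k in ["Z", "Y", "X"]:
    cache.get(k)          # fill: Z is least recent, X is most recent
cache.get("Y")            # touch Y; recency order is now Z < X < Y
cache.get("W")            # miss at capacity: Z (least recent) is evicted
print("Z" in cache.data)  # False: Z went first
print("Y" in cache.data)  # True: Y cannot be evicted while Z is present
```

No matter how the access pattern is crafted, the only key a miss can evict is whichever one is currently least recently touched.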

NovaX | 2 years ago

Those guarantees are also what make the algorithm exploitable and inflexible when facing unknown or adversarial workloads. Your LRU example suffers under scan / loop / zigzag patterns, which the author's policies also do poorly at. They target CDN workloads, where that can be ignored due to their scale, but that is not a generalizable assumption for workloads where such patterns are more common, like databases, search, and analytics.
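To make the scan problem concrete, here's a small simulation (my own toy, with made-up key names): a hot set of three keys hits almost perfectly on its own, but interleaving a stream of never-reused scan keys flushes the hot keys out of a strict LRU before each reuse:

```python
from collections import OrderedDict

def lru_hits(trace, capacity):
    """Simulate strict LRU over an access trace; return the hit count."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[key] = None
    return hits

# Hot keys alone: everything after the cold start hits.
hot_only = list("ABC") * 10
print(lru_hits(hot_only, 4))    # 27 hits out of 30

# Interleave two one-time scan keys after every hot access: the scan
# pushes each hot key out before it is reused, so every access misses.
scan_trace, u = [], 0
for k in list("ABC") * 10:
    scan_trace += [k, f"s{u}", f"s{u + 1}"]
    u += 2
print(lru_hits(scan_trace, 4))  # 0 hits
```

The hit rate collapses from 90% to 0% even though the hot set never changed, which is exactly why scan-resistant policies exist.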

If the algorithm is known and accepts user-facing requests, then it becomes a target. A client with the network speed of a dial-up modem can bring down a modern server simply by crafting a request pattern that exploits hash flooding. This is why you generally want some injected noise, adaptivity, and an algorithm that offers robustly good hit rates. A specialized policy that is superior on a narrow set of workloads is still very useful when targeted at an expensive problem, but that's a design assumption few can reasonably make.
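One way to inject that noise is random-sampling eviction: instead of maintaining a strict recency order, evict the least recently used key out of a small random sample (Redis's approximate LRU works on a similar principle). This toy sketch, with names and parameters of my own choosing, shows the idea; an adversary who knows the policy can no longer predict exactly which key a given request pattern will evict:

```python
import random

class SampledLRUCache:
    """Toy random-sampling eviction (sketch only): on a miss at capacity,
    evict the least recently used key from a random sample rather than
    the global LRU, so eviction order is not fully predictable."""
    def __init__(self, capacity, sample_size=5, seed=None):
        self.capacity = capacity
        self.sample_size = sample_size
        self.rng = random.Random(seed)
        self.last_used = {}  # key -> logical timestamp of last touch
        self.clock = 0

    def get(self, key):
        self.clock += 1
        if key in self.last_used:
            self.last_used[key] = self.clock
            return True   # hit
        if len(self.last_used) >= self.capacity:
            # Sample a few resident keys and evict the stalest of them.
            sample = self.rng.sample(list(self.last_used), self.sample_size)
            victim = min(sample, key=self.last_used.get)
            del self.last_used[victim]
        self.last_used[key] = self.clock
        return False      # miss

cache = SampledLRUCache(capacity=100, sample_size=5, seed=42)
for i in range(200):
    cache.get(i)              # fill and overflow: evictions are randomized
print(len(cache.last_used))   # 100: capacity is still respected
```

The capacity guarantee survives, but the deterministic "who gets evicted next" guarantee is deliberately given up, which is the trade-off being argued for here.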