top | item 46666448

rented_mule | 1 month ago

Nearly 20 years ago, I was working on indexing gigabytes of text on a mobile CPU, before smartphones caused massive investment in such CPUs. Word normalization logic (e.g., sky/skies/sky's -> sky) was very slow, so I used an in-memory cache, which sped it up immensely. Conceptually, the cache looked like {"sky": "sky", "skies": "sky", "sky's": "sky", "cats": "cat", ...}.
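A minimal sketch of that cache, assuming a memoized lookup in front of the slow path. The stemming rules below are hypothetical stand-ins for the real (much more expensive) normalization logic:

```python
cache = {}

def slow_normalize(word):
    # Stand-in for the expensive normalization step (e.g., skies -> sky).
    w = word.lower()
    if w.endswith("'s"):
        w = w[:-2]
    if w.endswith("ies"):
        w = w[:-3] + "y"
    elif w.endswith("s"):
        w = w[:-1]
    return w

def normalize(word):
    # A cache hit skips the slow path entirely.
    if word not in cache:
        cache[word] = slow_normalize(word)
    return cache[word]
```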

I needed cache eviction logic because only 1 MB of RAM was available to the indexer, and most of that was used by the library that parsed the input format. The initial version of that logic emptied the entire cache when it hit a certain number of entries, just as a placeholder. When I got around to adding some LRU eviction logic, it became faster on our desktop simulator but far slower on the embedded device — slower than with no cache at all. I tried several different "smart" eviction strategies. All of them were faster on the desktop and slower on the device. The disconnect came down to CPU cache (not word cache) size and strategy differences between the desktop and mobile CPUs — that was fun to diagnose!
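The "dumb" strategy that won out really is only a couple of lines. A sketch, with a made-up threshold (the real value would have been tuned against the 1 MB budget):

```python
MAX_ENTRIES = 4096  # hypothetical threshold, not the shipped value

def maybe_evict(cache):
    # "Dumb" eviction: drop everything once the cache fills up.
    # No per-entry bookkeeping, no linked lists to chase — which is
    # exactly what kept it fast on the small-CPU-cache mobile chip.
    if len(cache) >= MAX_ENTRIES:
        cache.clear()
```

Compared with LRU, this touches no extra memory on the hot (cache-hit) path, which is plausibly why the smarter strategies lost on the device.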

We ended up shipping the "dumb" eviction logic because it was so much faster. The eviction function was only two lines of code, plus a large comment explaining all this and saying something to the effect of "yes, this looks dumb, but test speed on the target device when making it smarter."
