(no title)
vlmutolo | 10 months ago
So a single query to the filter should only have one or two cache misses, depending on the size of your block. Or even if your block is larger than a cache line, you can probably issue all the loads at once and only pay for the latency of one memory access.
The downside of doing this is slightly more space usage relative to simple Bloom filters. I’d almost always reach for block Bloom filters, though, once the filter becomes a significant fraction of cache size.
I implemented block bloom filters for fairly large (~GB) arrays and saw about 35ns performance. They’re excellent data structures, pretty much as fast as you can get for approximate membership tests (though other filters have better space-time tradeoffs).
No comments yet.