top | item 44715435

(no title)

axeluser | 7 months ago

Hi, thank you for your questions: 1. The "one-line" boolean condition still has jumps in the produced JIT Asm, at least to make a short circuit to exit as soon as possible. I guess the C# compiler is far too conservative for such optimisations.

2 and 3. The "operations" are both the array's size (actually the nearest power of 2) and lookup operations. I first thought that latency should be equal for both cases, because my benchmarks may be too naive - they test lookup by sequential memory access, i.e., iterating buckets from 0 to n, and iterating from first to last element in each bucket. I'll try to experiment with a more shuffled workload in my free time. Thank you for the question. I've not thought about it deeply. Also, counting mispredictions is a good idea; I'll add this metric to my benchmarks to get a complete picture.

4. I didn't pay much attention to that because the trend was still the same, and even the ratio was similar for different counts of lookups per underlying memory layout. There was just some minor deviation, but I ignored it. Maybe I'll run more granular tests with hardware counters to catch some insights.

5. SIMD, in my opinion, would be an overhead there because for a single lookup in my configuration, I'm using a single 32-bit value, and Cuckoo Filter will touch at max 64 bits (2 lookups). However, I was planning to experiment with batching of lookups; I just haven't reached it yet.

discuss

order

curiouscoding|7 months ago

> they test lookup by sequential memory access, i.e., iterating buckets from 0 to n, and iterating from first to last element in each bucket

Yeah; that completely eliminates the cache misses / memory latency you'd have in practice. Of course eliminating that bottleneck is useful if you want to purely optimize CPU speed, but probably not quite as representative of real workloads. Also makes sense then that different array sizes give similar results: streaming tends to be fast anyway, regardless of the array size.

maxlapdev|7 months ago

The one liner shouldn't have shortcircuiting since it uses the binary or |, not the boolean/logical or||.

axeluser|7 months ago

Aha, you're right, didn't look carefully