top | item 46423158

zokrezyl | 2 months ago

That's fine, and thank you! I am playing around with the idea, and in theory it all works. The only concern is that things like "first non ..." often involve branching, which hurts the CPU's branch prediction. That is why I kindly invited you to show it in code.

clausecker|2 months ago

You can find the first set bit in an integer with a single machine instruction; it's completely branch-free. gcc has __builtin_ctz() for this (its result is undefined for an input of 0). To get at every match, you'll either need to iterate over all set bits (so one branch per set bit) or use a compress instruction (requiring AVX-512) to turn the bit set into a set of integers.

That said, as you seem to actually want to do something with the results, you'll take a branch per match anyway, so I don't see the problem.
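Since code was asked for, here is a minimal sketch of the "iterate over all set bits" variant in C for gcc or clang. The function name set_bit_indices and the 64-bit mask are my own choices for illustration, not something from the discussion above, and the sketch assumes the match bitmask has already been computed somewhere else.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores the index of every set bit of mask in out,
 * lowest bit first, and returns how many indices were written.
 * out must have room for 64 entries. */
static size_t set_bit_indices(uint64_t mask, unsigned out[64])
{
    size_t n = 0;

    while (mask != 0) {  /* the only branch: taken once per set bit */
        /* Index of the lowest set bit. __builtin_ctzll is undefined for 0,
         * which the loop condition rules out. */
        out[n++] = (unsigned)__builtin_ctzll(mask);

        /* Clear the lowest set bit so the next iteration finds the next one. */
        mask &= mask - 1;
    }
    return n;
}
```

Finding each individual bit is branch-free; the only branch is the loop condition, which is the "one branch per set bit" cost mentioned above. For example, a mask of 0x29 (binary 101001) yields the indices 0, 3 and 5.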