top | item 39012624

(no title)

asQuirreL | 2 years ago

You're right -- I got nerd-sniped by this, and ended up rewriting the benchmarks in Rust. The compiler successfully inlines the square root operation, simplifies the exponentiation into multiplication and then does common-subexpression elimination over it, but no induction variable analysis, and therefore also no loop-invariant code motion. Here's it is in Godbolt:

https://godbolt.org/z/Pa6cqd1fb

But the funny thing is, I tried it on my machine, and guess what, the "unoptimised" version is consistently between 4 and 5 times faster than the hand-optimised variants:

https://gist.github.com/amnn/4be5c05975c250d0ea88b62c03f1719...

So that's fun.

discuss

order

ColinWright|2 years ago

> ... the "unoptimised" version is consistently between 4 and 5 times faster than the hand-optimised variants:

I'm getting this comment a lot, but so far the reason has been that people are using smallish numbers, and implementing the square root in floats.

In the context where this exploration is taking place, you can't do that. We're working with exact huge integers, and while a square root on floats can get you in the ballpark, you then need to refine the answer to get the exact value, and that takes order of magnitude the same time.

asQuirreL|2 years ago

How big are we talking? The benchmarks I tried the two numbers in your example, but yes, if you go above 2^53, then you will run into trouble with the naive solution, but that is for correctness reasons, not performance reasons.