borkabrak | 10 years ago

She's right about squaring on each iteration, though. And granted, a square root is much more expensive, but it's done only once. Which method is faster would depend on the number of iterations.
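
To make the trade-off concrete, here is a minimal sketch of the two approaches. The thread doesn't show the loop being discussed, so trial division is only an assumed stand-in, and the function names `is_prime_squaring` and `is_prime_sqrt_once` are made up for illustration.

```python
import math


def is_prime_squaring(n: int) -> bool:
    """Trial division; the loop bound is checked by squaring d every iteration."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:  # one extra multiply per iteration
        if n % d == 0:
            return False
        d += 1
    return True


def is_prime_sqrt_once(n: int) -> bool:
    """Trial division; the loop bound is a single square root computed up front."""
    if n < 2:
        return False
    limit = math.isqrt(n)  # one (more expensive) square root, done only once
    for d in range(2, limit + 1):
        if n % d == 0:
            return False
    return True
```

Both versions test the same divisors, so they should always agree; the only difference is whether the bound costs one multiply per pass or one square root in total.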

uxcn|10 years ago

The latency for sqrtss on Broadwell is 11 cycles with a throughput of 4, whereas mul has a latency of 3 and a throughput of 1. So, using some concrete numbers, sqrt is more expensive, but not polynomially so, or even by an order of magnitude.
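
As a rough back-of-the-envelope check on those figures, the sketch below estimates how many iterations it takes before a multiply per iteration costs as much as one square root. It is a deliberately naive model: it just divides one cycle count by the other and ignores pipelining, out-of-order execution, and everything else in the loop. The helper name `break_even_iterations` is hypothetical.

```python
import math


def break_even_iterations(sqrt_cycles: float, mul_cycles: float) -> int:
    """Smallest iteration count at which repeated multiplies cost at least one sqrt.

    Naive model: total multiply cost = iterations * mul_cycles.
    """
    return math.ceil(sqrt_cycles / mul_cycles)


# Figures quoted above for Broadwell (sqrtss vs. mul).
by_latency = break_even_iterations(11, 3)    # ceil(11 / 3) = 4
by_throughput = break_even_iterations(4, 1)  # ceil(4 / 1) = 4
print(by_latency, by_throughput)
```

Under this crude model the single square root pays for itself after about four iterations, whichever pair of numbers is used.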

JoeAltmaier|10 years ago

Right, with modern floating-point implementations it's not the old guess-and-iterate method any more. SQRT is probably now on the order of an inverse?