ccapo | 6 years ago

I remember reading about Kahan summation, and I naively thought it would help improve the accuracy/consistency of an n-body code I was using for my PhD. I was trying to resolve a discrepancy between summations performed serially versus in parallel using OpenMP. In the end I abandoned the approach, since it seemed the issue lay elsewhere.

kardos | 6 years ago

I've used Kahan summation in a different physics simulation that was giving parallel vs. serial discrepancies because its global sums accumulate in a different order in parallel. It did help, but I ultimately concluded that it isn't a complete solution for this issue. In short, I think it just pushes the discrepancy several orders of magnitude down.
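For anyone who hasn't seen it, here is a minimal sketch of Kahan (compensated) summation in Python. It is not the code from either simulation mentioned here; the function names `naive_sum` and `kahan_sum` and the random test data are made up for illustration.

```python
import math
import random


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction carried over
        t = total + y                   # low-order bits of y can be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


if __name__ == "__main__":
    # Hypothetical data spanning many magnitudes, summed in two different orders.
    data = [random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-8, 8)
            for _ in range(100_000)]
    shuffled = random.sample(data, len(data))
    reference = math.fsum(data)  # correctly rounded sum, used as the yardstick
    print("naive error:      ", abs(naive_sum(data) - reference))
    print("kahan error:      ", abs(kahan_sum(data) - reference))
    print("kahan order delta:", abs(kahan_sum(data) - kahan_sum(shuffled)))
```

The last line is the relevant one for the parallel vs. serial question: the compensated result is typically far closer to the reference, but nothing in the algorithm guarantees that two summation orders agree bit for bit.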

For what it's worth, in the event that your issue was indeed the differing order of summation: there is a solution that /does/ work, which is to cast the numbers to a large (e.g., 256- or 512-bit) fixed-point representation and leverage the reproducibility of integer sums, since integer addition is associative and gives the same result in any order.
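A rough sketch of that idea in Python, under stated assumptions: a 256-bit signed format with 128 fractional bits (the split is an arbitrary choice for illustration), emulated with Python's arbitrary-precision integers plus an explicit range check. The names `to_fixed` and `reproducible_sum` are hypothetical, and a real simulation code would more likely do this with multi-word integers in C or Fortran.

```python
import itertools

TOTAL_BITS = 256   # assumed width of the signed fixed-point format
FRAC_BITS = 128    # assumed number of bits after the binary point
_LIMIT = 1 << (TOTAL_BITS - 1)


def to_fixed(x):
    """Convert a finite float to an integer count of 2**-FRAC_BITS units."""
    n, d = x.as_integer_ratio()    # exact: x == n / d, with d a power of two
    fixed = (n << FRAC_BITS) // d  # floors anything finer than 2**-FRAC_BITS
    if not -_LIMIT <= fixed < _LIMIT:
        raise OverflowError(f"{x!r} does not fit in {TOTAL_BITS}-bit fixed point")
    return fixed


def reproducible_sum(values):
    """Sum floats via integer addition, which is exact and order-independent."""
    total = sum(to_fixed(x) for x in values)
    # Only the final total is range-checked; Python integers never wrap.
    if not -_LIMIT <= total < _LIMIT:
        raise OverflowError("sum does not fit in the fixed-point format")
    return total / (1 << FRAC_BITS)  # one rounding, back to a float


if __name__ == "__main__":
    values = [1e16, 1.0, -1e16, 3.14, 1e-10]
    results = {reproducible_sum(p) for p in itertools.permutations(values)}
    print(results)  # a single element: every ordering gives the same float
```

In this sketch, values too large for the format raise `OverflowError`, while bits finer than 2**-128 are floored silently. Both choices are deterministic, so the result stays reproducible, but the second one loses information without warning.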

snovv_crash | 6 years ago

The issue here is that unless you check at every stage of the calculation that your values don't overflow or underflow, you can never actually know whether your answers are accurate, even if they are repeatable.