top | item 37007906


Tunabrain | 2 years ago

GPUs are deterministic machines, even for floating point.

The behavior in the linked article has to do with the use of atomic adds to reduce sums in parallel. Floating point addition is not associative, so the order in which addition occurs matters. When using atomic adds this way, you get slightly different results depending on the order in which threads arrive at the atomic add call. It's a simple race condition, although one which is usually deemed acceptable.
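The non-associativity is easy to demonstrate on any IEEE-754 machine, no GPU required. A minimal Python example (the values are mine, chosen to make the rounding visible):

```python
# Floating-point addition is not associative: the same three values
# summed with a different grouping give different results.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # 0.0 + 1.0 -> 1.0
right = a + (b + c)  # -1e16 + 1.0 rounds back to -1e16 (the 1.0 is
                     # below half an ulp at that magnitude), so 0.0

print(left, right)   # 1.0 0.0
```

This is why the order in which threads hit the atomic add changes the final sum, even though each individual add is computed exactly per the IEEE-754 rules.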


n2d4 | 2 years ago

I just edited my comment while you were writing yours to add an explanation. The point here is that some primitives in e.g. cuDNN are non-deterministic. Whether you classify that as a race condition or not is a different question; either way, it's intended behaviour.

xyzzy_plugh | 2 years ago

Right but that's not an inherent GPU determinism issue. It's a software issue.

https://github.com/tensorflow/tensorflow/issues/3103#issueco... is correct that it's not necessary, it's a choice.

Your line of reasoning appears to be "GPUs are inherently non-deterministic, so don't be quick to judge someone's code", which as far as I can tell is dead wrong.

Admittedly there are some cases and instructions that may result in non-determinism, but those are inherently necessary. The author should think carefully before introducing non-determinism. There are many scenarios where it is irrelevant, but ultimately the issue we are discussing here isn't the GPU's fault.
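The choice being discussed can be sketched in plain Python: a reduction whose accumulation order depends on thread arrival (modelled here by a seeded shuffle; all names are mine, not any library's API) is reproducible only if the arrival order happens to repeat, while a fixed reduction order always reproduces.

```python
import random

# Synthetic data, generated reproducibly.
rng = random.Random(0)
vals = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]

def racy_sum(values, arrival_seed):
    """Sum `values` in an order determined by `arrival_seed`,
    mimicking the nondeterministic order of atomic adds."""
    order = list(values)
    random.Random(arrival_seed).shuffle(order)
    total = 0.0
    for v in order:
        total += v
    return total

# The same arrival order always reproduces the same result...
print(racy_sum(vals, 1) == racy_sum(vals, 1))  # True

# ...but different arrival orders typically differ in the low bits,
# which is exactly the "acceptable" race described upthread.
print(racy_sum(vals, 1) - racy_sum(vals, 2))
```

A deterministic implementation simply commits to one accumulation order (e.g. a fixed tree reduction), usually at some cost in throughput, which is why libraries offer it as an option rather than the default.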

DeathArrow | 2 years ago

If the hardware is deterministic, so are the results. You can't generate random numbers purely in software with deterministic hardware.