montebicyclelo | 4 months ago
Yep, this kind of thing can happen. I found and reported incorrect gradients for Apple's Metal-backed TensorFlow conv2d in 2021 [1].
(Pretty sure I've seen incorrect gradients with another PyTorch backend, but that was a few years ago and I don't seem to have raised an issue to refer to...)
One might think this class of errors would be caught by a test suite. Autodiff can be tested quite comprehensively against numerical differentiation [2]. (Although this example is from a much simpler lib than PyTorch, so I could be missing something.)
[1] https://github.com/apple/tensorflow_macos/issues/230
[2] https://github.com/sradc/SmallPebble/blob/2cd915c4ba72bf2d92...
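For anyone who hasn't seen a gradient check before, here is a minimal sketch of the idea in plain Python. It is not taken from the linked test suite: the function `f`, its hand-derived gradient `analytic_grad`, and the step size `eps` are all hypothetical stand-ins for an autodiff result. The check compares that gradient against central finite differences.

```python
import math

def f(x):
    # Hypothetical scalar function of a vector: sum of x_i * tanh(x_i).
    return sum(xi * math.tanh(xi) for xi in x)

def analytic_grad(x):
    # Hand-derived gradient, standing in for what an autodiff library returns:
    # d/dx [x * tanh(x)] = tanh(x) + x * (1 - tanh(x)^2)
    return [math.tanh(xi) + xi * (1.0 - math.tanh(xi) ** 2) for xi in x]

def numerical_grad(func, x, eps=1e-6):
    # Central differences: two function evaluations per input element.
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((func(x_plus) - func(x_minus)) / (2 * eps))
    return grad

def max_abs_diff(a, b):
    # Largest elementwise disagreement between two gradients.
    return max(abs(ai - bi) for ai, bi in zip(a, b))

x = [0.5, -1.2, 2.0]
err = max_abs_diff(analytic_grad(x), numerical_grad(f, x))

# A deliberately wrong gradient (scaled by 2) should fail the same check.
wrong_grad = [2.0 * g for g in analytic_grad(x)]
wrong_err = max_abs_diff(wrong_grad, numerical_grad(f, x))
```

With a tolerance of around 1e-6, `err` should pass and `wrong_err` should fail. Note the loop in `numerical_grad` evaluates the function twice per input element, which is cheap here but grows with the size of the input.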
gcr|4 months ago
liuliu|4 months ago
BTW, numerical differentiation can only be tested in a very limited way (the algorithmic complexity gets out of hand when you are dealing with big matrices). It is much easier and more effective to test against multiple implementations.
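A rough sketch of what testing against multiple implementations could look like, in plain Python. Both functions below are hypothetical and written for illustration only; think of them as stand-ins for two backends computing the input gradient of the same 1D "valid" cross-correlation `y[i] = sum_k x[i + k] * w[k]`, given an upstream gradient `g`.

```python
def grad_x_scatter(w, g, n):
    # Implementation A: each output position i scatters g[i] * w[k]
    # back into the input position i + k that it read from.
    dx = [0.0] * n
    for i in range(len(g)):
        for k in range(len(w)):
            dx[i + k] += g[i] * w[k]
    return dx

def grad_x_gather(w, g, n):
    # Implementation B: each input position j gathers contributions
    # from every output i = j - k that touched it.
    dx = []
    for j in range(n):
        total = 0.0
        for k in range(len(w)):
            i = j - k
            if 0 <= i < len(g):
                total += g[i] * w[k]
        dx.append(total)
    return dx

w = [1.0, 2.0, 3.0]        # kernel
g = [1.0, 0.0, -1.0]       # upstream gradient, one entry per output
n = len(g) + len(w) - 1    # input length for a "valid" correlation

dx_a = grad_x_scatter(w, g, n)
dx_b = grad_x_gather(w, g, n)
mismatch = max(abs(a - b) for a, b in zip(dx_a, dx_b))
# Both give [1.0, 2.0, 2.0, -2.0, -3.0] for this input.
```

Each implementation runs once, so the cost of the comparison does not depend on perturbing every input element. The catch is that agreement only shows the two are consistent with each other, so it helps if the implementations are genuinely independent.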
antoine-levitt|4 months ago