(no title)
shoyer | 4 years ago
I don't think this is quite fair. There are several variations of 2nd order methods, notably KFAC and Shampoo, that seem to be quite effective for large-scale neural network training. See the intro of this paper for an overview: https://openreview.net/forum?id=-t9LPHRYKmi
WithinReason|4 years ago