This area is a big deal - ML networks need to be much deeper and denser to provide human-level understanding, and training networks is currently a considerable bottleneck.
Does this method make it easier to spread a neural network over multiple GPUs/machines? I mean, does it reduce the amount of data being communicated between compute nodes, or does it just decouple the updates from the need to wait for the rest of the net to finish?
My gut says this sort of training alternative to backpropagation has a lot of uses where SVMs have no applicability. The article talks a lot about RNNs (neural nets for sequence prediction), but I would guess it has uses in online learning as well. Learning twice as fast in those situations seems pretty significant to me.
edmack|9 years ago
visarga|9 years ago
alphonse23|9 years ago
Also, this article is so well presented it deserves an award for how pretty it is.
Anm|9 years ago