voqv | 3 years ago
Is that why it took so long? I was under the impression it was because of diminishing gradients in backprop once you stack a huge number of layers (the "deep" in deep neural networks).
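For what it's worth, here is a minimal sketch of the diminishing-gradient effect, not a claim about any particular network. It assumes a toy chain of scalar sigmoid layers that all share one weight; the function name `gradient_through_chain`, the weight of 1.0, and the input of 0.5 are all made up for illustration. Since the sigmoid's derivative never exceeds 0.25, the chain-rule product shrinks roughly geometrically with depth.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_chain(depth, weight=1.0, x=0.5):
    """Return d(output)/d(input) for `depth` stacked scalar sigmoid layers.

    Hypothetical toy model: every layer computes a = sigmoid(weight * a_prev).
    """
    grad = 1.0
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        # Chain rule: each layer contributes sigmoid'(z) * weight,
        # where sigmoid'(z) = a * (1 - a) and is at most 0.25.
        grad *= weight * a * (1.0 - a)
    return grad


if __name__ == "__main__":
    for depth in (1, 5, 20):
        print(depth, gradient_through_chain(depth))
```

With these assumed values the gradient reaching the first layer is bounded by 0.25 raised to the depth, so by 20 layers it is effectively zero. Larger weights or other activations (ReLU, for example) change the picture, which is part of why deeper nets became trainable later.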