
voqv | 3 years ago

Is that why it took so long? I was under the impression it was because of vanishing gradients in backprop once you stack a large number of layers (the "deep" in deep neural networks).
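
To make the effect concrete, here is a toy sketch, not a real training setup. It assumes a chain of scalar sigmoid layers with every weight fixed at 1.0 and an arbitrary starting input of 0.5; the names `sigmoid` and `input_gradient` are made up for illustration. Since the sigmoid's derivative never exceeds 0.25, the gradient that backprop carries to the first layer is bounded by 0.25 raised to the depth, so it shrinks geometrically as layers are stacked.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def input_gradient(depth: int, weight: float = 1.0, x: float = 0.5) -> float:
    """Return d(output)/d(input) for a chain of `depth` scalar sigmoid layers.

    Hypothetical toy model: each layer computes a = sigmoid(weight * a_prev).
    """
    grad = 1.0
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        # Chain rule: d sigmoid(w * a_prev) / d a_prev = w * a * (1 - a)
        grad *= weight * a * (1.0 - a)
    return grad


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        print(depth, input_gradient(depth))
```

With larger weights or a non-saturating activation such as ReLU, the per-layer factor is no longer capped at 0.25, which is one reason those choices are usually cited as helping deep networks train.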
