subtypefiddler | 5 years ago

It's also important to note that they work despite being wide. You can see that in how effective pruning is, and in ideas such as the lottery ticket hypothesis, which states that "successful" sub-networks within the wide network account for most of the performance.
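
To make the pruning point concrete, here's a rough sketch of one-shot magnitude pruning: zero out the smallest-magnitude weights and keep the rest. This is a simplification and not the lottery ticket procedure itself; `magnitude_prune` and the example weights are made up for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    Hypothetical helper for illustration; `sparsity` is the fraction to drop.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Made-up weights: the three large ones stand in for the "successful" sub-network.
weights = [0.9, -0.02, 0.4, 0.01, -0.7, 0.05]
pruned = magnitude_prune(weights, 0.5)
print(pruned)  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Half the weights are gone, but the large ones that presumably do most of the work survive.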

In the theory literature, if you have a K-deep network, K=1 is the shallow case and K>1 is deep. Agreed, the naming could be better, but "deep" here isn't meant in the sense of "deep work" or "deep thoughts", as the parent was suggesting.
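
A small sketch of what the K in "K-deep" counts, assuming K is the number of hidden (nonlinear) layers before a linear readout. That counting convention, the tanh activation, and all names and weights below are my assumptions for illustration.

```python
import math


def hidden_layer(x, weights, biases):
    """One hidden layer: affine map followed by tanh."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, hidden, readout):
    """Apply each hidden layer in turn, then a linear readout."""
    for weights, biases in hidden:
        x = hidden_layer(x, weights, biases)
    return sum(w * xi for w, xi in zip(readout, x))


def is_deep(hidden):
    """K = len(hidden): K == 1 is shallow, K > 1 is deep."""
    return len(hidden) > 1


# Made-up weights: one hidden layer (K=1) versus two (K=2).
shallow = [([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0])]
deep = shallow + [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])]
readout = [1.0, 1.0]

print(is_deep(shallow), is_deep(deep))  # False True
```

So "deep" is just a statement about how many layers are composed, nothing more.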
