(no title)
cleancoder0 | 4 years ago
The fact that layers themselves are narrower means that training and evaluation of the NN is also much faster.
cleancoder0 | 4 years ago
The fact that layers themselves are narrower means that training and evaluation of the NN is also much faster.
oofbey|4 years ago
chabons|4 years ago
londons_explore|4 years ago
Since the network will have seen far more english text than text of other languages, it suggests that performance on limited training data is more improved.
modeless|4 years ago
yoquan|4 years ago