chronolitus | 3 years ago
Of course, there's all the secret sauce that goes into actually getting the models to learn anything, plus all the empirical progress that has made training more efficient (ReLUs, etc.). But how many of those are fundamental, vs. simply efficiency shortcuts? And if you'd asked me 10 years ago what I thought it would take to get the kind of output these large models produce these days, I would not have guessed anything nearly as simple as what those models actually are.