More parameters also means that the likelihood of overfitting the training set increases. Currently (and rather unintuitively, considering that ML is an applied optimization field, and optimization is usually concerned with underfitting), the bane of ML is overfitting. It's easy to supply a model with high representational capacity, but it's much harder to get it to learn anything interesting. You can fit your training set perfectly, because your model has enough degrees of freedom to fit a million points arbitrarily well, but that doesn't mean the resulting fit describes the data in a meaningful way. This is why a core tenet of ML is to prune parameters whenever possible.

Neurogenesis increases representational capacity only when it detects that the underlying model does not have enough capacity to fit the data; from this perspective, you start small (under capacity) and then gradually grow until you hit the optimal model. In other words, Neurogenesis is also a way to minimize the number of options.

On the other hand, giving the model more options than it strictly needs and letting it decide what is important will usually backfire. Rather than learning a few meaningful, functional features, it can simply fit the training data outright from the very beginning. It will therefore decide that everything is important, because all those extraneous parameters let it squeeze that last 0.5% out of your training set.
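To make the start-small-and-grow idea concrete, here is a minimal sketch in PyTorch. The `widen()` helper, the toy sine-regression task, the two-units-per-round growth schedule, and the plateau threshold of `1e-4` are all illustrative assumptions on my part, not the specific mechanics of Neurogenesis; the point is just the loop: train a deliberately small network to convergence, and add capacity only while the training loss shows it is still underfitting.

```python
import torch
import torch.nn as nn

def widen(fc_in: nn.Linear, fc_out: nn.Linear, n_new: int):
    """Insert n_new hidden units between two linear layers.

    Trained weights are copied over; the new units' outgoing weights
    start at zero, so the widened network computes exactly the same
    function and training resumes where it left off.
    """
    h = fc_in.out_features
    new_in = nn.Linear(fc_in.in_features, h + n_new)
    new_out = nn.Linear(h + n_new, fc_out.out_features)
    with torch.no_grad():
        new_in.weight[:h] = fc_in.weight
        new_in.bias[:h] = fc_in.bias
        new_out.weight[:, :h] = fc_out.weight
        new_out.bias.copy_(fc_out.bias)
        # Zero outgoing weights keep the function unchanged; gradients
        # pull them off zero once training resumes.
        new_out.weight[:, h:] = 0.0
    return new_in, new_out

# Toy regression task that a 2-unit network underfits.
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(3 * x)

model = nn.Sequential(nn.Linear(1, 2), nn.Tanh(), nn.Linear(2, 1))
loss_fn = nn.MSELoss()

for growth_round in range(6):            # start small, grow as needed
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(2000):                 # train to (near) convergence
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print(f"round {growth_round}: hidden={model[0].out_features}, "
          f"train loss={loss.item():.5f}")
    if loss.item() < 1e-4:                # capacity looks sufficient
        break
    # Still underfitting: add capacity and keep training.
    model[0], model[2] = widen(model[0], model[2], n_new=2)
```

The design choice worth noting is that growth is function-preserving: because the new units' outgoing weights start at zero, adding capacity never undoes what the small model already learned, so you only ever pay for the parameters the data actually demands.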