top | item 39609576

hashta | 2 years ago

I have a lot of experience working with both families of models. If you use an ensemble of 10 NNs, they outperform well-optimized tree-based models such as XGBoost & RFs.

hashta | 2 years ago

To both questions above: simple averaging of the logits (classification) or raw outputs (regression) usually works well. If I had to guess why people don't use this approach more often in Kaggle competitions, I'd say it's the relative difficulty of training an ensemble of NNs. Also, NNs are a bit more sensitive than decision trees (DTs) to the type of features used and their distribution.
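A minimal sketch of that averaging step (toy numpy example; the logit values and function name are illustrative, not from any particular competition):

```python
import numpy as np

def ensemble_predict(logit_list):
    """Average raw logits from several models, then take the argmax.
    Averaging pre-softmax logits is one common choice; averaging
    post-softmax probabilities is another."""
    avg = np.mean(np.stack(logit_list), axis=0)  # (n_models, n, k) -> (n, k)
    return avg.argmax(axis=1)

# Toy example: three "models" emit logits for 2 samples, 3 classes.
m1 = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
m2 = np.array([[1.8, 0.7, 0.2], [0.1, 0.4, 2.0]])
m3 = np.array([[2.2, 0.4, 0.0], [0.3, 1.8, 0.2]])
preds = ensemble_predict([m1, m2, m3])  # class index per sample
```

For regression the same idea applies with `np.mean` over the raw outputs and no argmax.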

Ensemble models work well because they reduce both bias & variance errors. Like DTs, NNs have low bias errors and high variance errors when used individually. The variance error drops as you use more learners (DTs/NNs) in the ensemble. Also, the more diverse the learners, the lower the overall error.

Simple ways to promote diversity among the NNs in the ensemble are to initialize their weights from different random seeds and to train each one on a random sample of the overall training set (say 70-80%, without replacement).
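The seed-plus-subsample recipe above can be sketched like this (a hypothetical helper; the 80% fraction and seed scheme are just examples):

```python
import numpy as np

def make_bags(n_samples, n_learners, frac=0.8, base_seed=0):
    """For each learner, pick a distinct seed for weight init and draw a
    random ~frac subsample of the training indices without replacement.
    Each (seed, indices) pair would parameterize one NN's training run."""
    bags = []
    for i in range(n_learners):
        rng = np.random.default_rng(base_seed + i)
        idx = rng.choice(n_samples, size=int(frac * n_samples), replace=False)
        bags.append({"seed": base_seed + i, "indices": np.sort(idx)})
    return bags

bags = make_bags(n_samples=1000, n_learners=10)
```

Each learner then trains on `X[bag["indices"]]` with its own seed, and their outputs are averaged at prediction time.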

padthai | 2 years ago

Which kind of ensemble? It can't be as easy as a voting meta-model of NNs with the same architecture/hyperparameters, right?

knightoffaith | 2 years ago

Is this true? I hear that XGBoost tends to win Kaggle competitions for tabular data - how come you don't see NN ensembles winning instead? Or do they?

malshe | 2 years ago

I would like to know this too. From what I understand, it is XGBoost that rules Kaggle.

jhoho | 2 years ago

Can you recommend a resource on this for the curious learner?