I can't explain it, but I help maintain TensorFlow Decision Forests [1] and Yggdrasil Decision Forests [2], and in an AutoML system at work that trains models on data from many different users, decision forest models get selected as the best (after AutoML tries various model types and hyperparameters) somewhere between 20% and 40% of the time, systematically. It's pretty interesting. The other model types considered are NNs, linear models (with automatic feature-cross generation), and a couple of other variations.

[1] https://github.com/tensorflow/decision-forests
[2] https://github.com/google/yggdrasil-decision-forests
onasta|3 years ago
Bonus question: are the stats you're mentioning publicly available?
coffee_am|3 years ago
Since the server doesn't work for all types of data, and ML experts would probably do their own hyperparameter tuning and build custom models, there is a bias in the types of datasets that end up competing.
But this share has been consistent over many months and across various unrelated datasets, I believe.