perturbation | 6 years ago
If I'm understanding your questions correctly, the main problems I see with this are:
- Using raw data instead of feature engineering (less of a problem given feature synthesis libraries like https://www.featuretools.com/ and other heuristic methods). I'd expect Google to do a good job of basic things like normalization of raw input features before training.
- Using features that it really shouldn't (if you just throw ML at your database for, say, loan applications, then sensitive or personally identifiable information can and will be used as features)
- Lack of insight / understanding as to what is driving the model. This can be partially overcome with post-training methods like LIME, Shapley values, etc.
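To make the last point concrete, here is a minimal sketch of exact Shapley values for a tiny model, computed by averaging each feature's marginal contribution over every feature ordering. The `toy_model` function, its feature names, and the all-zero baseline are assumptions made up for illustration; real tooling approximates this, since exact enumeration is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    `predict` takes a dict of feature values; `baseline` supplies the
    "feature absent" values. Cost grows factorially with feature count.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start with every feature "absent"
        previous = predict(current)
        for name in order:
            current[name] = instance[name]   # switch this feature on
            value = predict(current)
            totals[name] += value - previous  # its marginal contribution
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical scoring model with an interaction term (illustration only).
def toy_model(x):
    return 2 * x["income"] - 3 * x["debt"] + x["income"] * x["debt"]


instance = {"income": 4, "debt": 2}
baseline = {"income": 0, "debt": 0}
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the model's output for the instance and for the baseline, which is what makes them usable as a per-feature breakdown of a single prediction. A sensitive column with a clearly non-zero attribution would be a sign the model is relying on it.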
I wouldn't expect predictions to come from a set of discrete values. If you are, say, predicting housing values by training a NN, the output should be continuous and a function of the input features.
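As a rough illustration of the continuous-output point, the sketch below fits a one-feature least-squares line as a stand-in for a NN regressor (a deliberately simpler technique, not what any particular product does). The square-footage and price numbers are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: square footage -> price in thousands.
sqft = [1000.0, 1500.0, 2000.0, 2500.0]
price = [200.0, 290.0, 410.0, 500.0]

slope, intercept = fit_line(sqft, price)

# Predict for an unseen input; the result is not restricted to the
# target values seen during training.
prediction = slope * 1800.0 + intercept
print(prediction, prediction in price)
```

The prediction for an unseen house lands between the training prices rather than on one of them, because the model outputs a function of the input features instead of picking from the observed targets.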