top | item 21225370

jimfleming | 6 years ago

It’s really very problem-dependent. I allude to a few low-hanging things in my post above, e.g. feature engineering. Just because neural networks have an easier time learning non-linear feature transformations doesn’t mean it’s good to ignore feature engineering entirely.
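As a minimal sketch of the kind of feature engineering meant here, the block below adds hand-crafted non-linear features (a ratio and a log transform) alongside raw inputs; the column names (`price`, `sqft`) and derived features are hypothetical illustrations, not from the original comment.

```python
import numpy as np
import pandas as pd

def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with hand-crafted non-linear features appended."""
    out = df.copy()
    # Ratio feature: often easier for a model to exploit than two raw columns.
    out["price_per_sqft"] = out["price"] / out["sqft"]
    # Log transform compresses heavy-tailed values.
    out["log_sqft"] = np.log1p(out["sqft"])
    return out

df = pd.DataFrame({"price": [300000.0, 450000.0], "sqft": [1500.0, 2000.0]})
print(add_engineered_features(df).columns.tolist())
```

Even a model that could eventually learn these transformations usually trains faster and more stably when they are supplied directly.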

Possibly more important is to focus on the process by which you derive and apply model changes. You get some model performance, and then what? Rather than throwing something else at the model in a “guess-and-check” fashion, be methodical about what you try next. Go into each change with analysis and a hypothesis for why it’s worth spending time on and why it will help. Back that hypothesis with research, when possible, to save yourself the time of verifying something someone else has already done the legwork on. Then verify the hypothesis with empirical results and further analysis to understand the impact of the change. This sounds obvious (it’s just the scientific method) but in my experience ML practitioners and data scientists tend to forget, or were never taught, the “science” part. (I’m not accusing you of this; it just tends to be my experience.)
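One lightweight way to enforce this discipline is to record the hypothesis before making the change, then the measured result after. The sketch below is an assumed pattern, not anything from the original comment; all field names and metric values are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Experiment:
    """A single hypothesis-driven model change and its measured outcome."""
    change: str
    hypothesis: str
    supporting_refs: List[str] = field(default_factory=list)
    baseline_metric: Optional[float] = None
    result_metric: Optional[float] = None

    def verified(self) -> bool:
        # The hypothesis holds only if the result beats the recorded baseline.
        return (self.baseline_metric is not None
                and self.result_metric is not None
                and self.result_metric > self.baseline_metric)

exp = Experiment(
    change="add price_per_sqft ratio feature",
    hypothesis="ratio features reduce validation error on this tabular data",
    baseline_metric=0.81,
)
exp.result_metric = 0.84  # filled in after the training run
print(exp.verified())
```

Forcing the hypothesis to be written down first is the point: a change with no articulated hypothesis is the “guess-and-check” pattern the comment warns against.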

Random search, AutoML, hyperparameter searches, etc. are incredibly inefficient at model development so they’ll rarely land you in a better place unless a lot of upfront work has been put in. For us, they’re useful for two things: analysis and finalization. For analysis the search should be heavily constrained since you’re trying to understand something specific. For finalization of a model before going into production, a search on only the most sensitive parameters identified during development usually yields additional gains.
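A finalization-stage search like the one described might look like the sketch below: random search restricted to only the parameters found most sensitive during development. The choice of learning rate and depth as the sensitive parameters, and the toy objective, are assumptions for illustration.

```python
import random

def evaluate(params):
    # Stand-in for a real objective that would train the model and
    # return a validation metric; here, a toy function peaked at lr=0.01.
    return -(params["lr"] - 0.01) ** 2 - 0.001 * params["depth"]

random.seed(0)

# Constrained search space: only the two most sensitive parameters.
search_space = {
    "lr": lambda: 10 ** random.uniform(-4, -1),  # log-uniform draw
    "depth": lambda: random.randint(2, 8),
}

best = max(
    ({name: draw() for name, draw in search_space.items()} for _ in range(50)),
    key=evaluate,
)
print(best)
```

Keeping the space small is what makes the search affordable: each added dimension multiplies the number of trials needed to cover it, which is why an unconstrained search is so inefficient during development.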
