top | item 18771002

elyase | 7 years ago

We have done extensive testing on chatbot intent classification, and in our particular problem nothing (including CNNs, LSTMs, fastText, plus LUIS, Watson, and other proprietary classifiers) has been able to beat a simple linear model trained on character n-gram features.
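
For anyone curious what such a baseline looks like, here is a minimal sketch with scikit-learn. The toy utterances, intent labels, and vectorizer settings are made up for illustration; the parent didn't share their actual setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented intent dataset, just to show the shape of the pipeline.
texts = [
    "what is my account balance",
    "show me my balance please",
    "book a flight to paris",
    "i need a plane ticket to london",
]
labels = ["balance", "balance", "travel", "travel"]

# char_wb extracts character n-grams only from inside word boundaries,
# which tends to be robust to typos in short chatbot utterances.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),
    LogisticRegression(),
)
model.fit(texts, labels)
print(model.predict(["whats my balence"]))  # typo-laden input
```

The appeal is that a misspelled word still shares most of its character n-grams with the correct spelling, so the linear model degrades gracefully without any embedding machinery.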

eggie5 | 7 years ago

I've seen a rule of thumb that if the ratio of samples to words per sample is less than 1500, you probably don't have enough data for embeddings/CNNs.
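
The heuristic is cheap to check for a given corpus. A rough sketch in plain Python (the 1500 threshold is the rule of thumb quoted above; the toy corpus is illustrative):

```python
def samples_per_word_ratio(texts):
    """Number of samples divided by the median words per sample."""
    lengths = sorted(len(t.split()) for t in texts)
    mid = len(lengths) // 2
    if len(lengths) % 2:
        median = lengths[mid]
    else:
        median = (lengths[mid - 1] + lengths[mid]) / 2
    return len(texts) / median

# Invented corpus: 300 short utterances, median length 3 words.
corpus = ["book a flight", "check my balance now", "hello there"] * 100
ratio = samples_per_word_ratio(corpus)

# Heuristic: below ~1500, prefer n-gram features over learned embeddings.
use_ngrams = ratio < 1500
print(ratio, use_ngrams)  # 100.0 True
```

At 300 samples of ~3 words each, the ratio is 100, far under the threshold, which matches the intuition that short chatbot utterances rarely justify a CNN.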

minimaxir | 7 years ago

Chatbot intent would be a good use case for a linear model, as a single word/n-gram can have a high impact on the result (in contrast to advanced architectures, which try to account for ambiguity/contradictions in documents).

briga | 7 years ago

I've seen the same thing in the models I've built. For basic intent classification, simpler models seem to be more accurate, not to mention they train faster and require less memory. There seems to be a lot of emphasis on shiny complex neural network architectures, even when simple models work just fine.

FridgeSeal | 7 years ago

> There seems to be a lot of emphasis on shiny complex neural network architectures, even when simple models work just fine.

It's resume-driven-development for data scientists.

I've never seen an interviewer impressed by the fact that a job was done without deep learning, but say that you used deep learning (however spurious it might be) and they light up like it's Christmas.

helpme3 | 7 years ago

This isn't that surprising. I think the reason is that, even though the model is linear, the space of n-grams is so high-dimensional that there is usually a hyperplane separating any two classes.
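
A quick numpy illustration of that intuition: when the feature dimension (here 5000, standing in for a char n-gram vocabulary) far exceeds the number of samples, a linear model can fit even completely random labels. The numbers are arbitrary, chosen only to make the dimension/sample gap obvious:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 50, 5000
X = rng.standard_normal((n_samples, n_features))
y = rng.choice([-1.0, 1.0], size=n_samples)  # completely random labels

# Least-squares fit: with n_features >> n_samples the system is
# underdetermined, so the minimum-norm solution interpolates the labels.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
train_acc = np.mean(np.sign(X @ w) == y)
print(train_acc)  # 1.0
```

Of course this only shows separability on the training set, not generalization; the reason n-gram linear models generalize anyway is that the individual features are genuinely informative, which is where regularization earns its keep.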

laichzeit0 | 7 years ago

Anecdotally, I found the same thing. An n-gram bag-of-words model is surprisingly difficult to beat.

thanatropism | 7 years ago

I've been curious about what manifold learning (especially now that we have UMAP) would do to that kind of workflow's performance.