codesushi42 | 6 years ago

This. Exactly this. No sophisticated tokenization. No interesting architecture using attention. And the author is completely clueless about overfitting... and even cross-entropy loss. He could have gotten better results just using a bag-of-words approach.

But this ends up on the front page anyway. Welcome to HN.

objektif|6 years ago

What tools would you use to detect overfitting in this case and in general?

codesushi42|6 years ago

My brain.

You will overfit an NN trained on only 1000 examples.

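Also, a simple train/test split will tell you that: hold out part of the data, and if accuracy on the training set is far higher than on the held-out set, the model is overfitting. But the author failed to take any time to learn the basics before spewing out this drivel.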
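
For anyone who wants to see the train/test check concretely, here is a rough sketch. It is not the article's model: it assumes made-up data (1000 examples with random features and pure-noise labels) and swaps in a 1-nearest-neighbor classifier as a stand-in for any model with enough capacity to memorize its training set. All names (`train_test_split`, `predict_1nn`, `accuracy`) are hypothetical helpers written for this example, using only the standard library.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)),
    )
    return nearest[1]


def accuracy(train, examples):
    """Fraction of examples whose label the 1-NN model predicts correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in examples)
    return correct / len(examples)


# Assumption: synthetic data with no real signal, so anything the model
# "learns" from the training set is memorized noise.
rng = random.Random(42)
data = []
for _ in range(1000):
    features = [rng.random() for _ in range(5)]
    label = rng.randint(0, 1)
    data.append((features, label))

train, test = train_test_split(data)

train_acc = accuracy(train, train)  # scored on data the model has seen
test_acc = accuracy(train, test)    # scored on held-out data
gap = train_acc - test_acc

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
print(f"gap:            {gap:.2f}")
```

The memorizing model scores perfectly on its own training set, while the held-out score sits near chance. That gap is the signal to look for; with a neural network the same comparison is usually made per epoch, watching for training loss that keeps falling while held-out loss stalls or rises.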