
In my opinion, the most provocative point this paper makes isn't about general reproducibility issues or comparisons against weak baselines. It's that a number of these papers used improper methods to obtain their results in the first place.

For instance, the NCF and MCRec papers tuned model parameters on the test set, and the SpectralCF paper used a non-randomly sampled test set for evaluation.

That, to me, is even more surprising than the paper's finding that a well-tuned statistical baseline outperforms these models.
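
For anyone wondering what the proper protocol looks like, here is a minimal sketch in Python. It is not code from the paper or from any of the criticized models: `random_split`, `pick_hyperparameter`, the 10% fractions, and the toy data are all my own assumptions, just to illustrate randomly sampling the held-out sets and tuning on validation data only.

```python
import random


def random_split(interactions, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle interactions and cut them into train/validation/test lists.

    Hypothetical helper: the fractions and seed are illustrative defaults.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(interactions)  # copy, so the caller's data is untouched
    rng.shuffle(shuffled)          # random sampling, not a hand-picked test set
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def pick_hyperparameter(candidates, score_on_validation):
    """Return the candidate with the highest validation score.

    The scoring callable must only ever see validation data; the test
    set is touched once, after this choice has been made.
    """
    return max(candidates, key=score_on_validation)


# Toy data (assumed): 20 users x 5 items = 100 (user, item) pairs.
interactions = [(u, i) for u in range(20) for i in range(5)]
train, val, test = random_split(interactions)
```

The point is the order of operations: split randomly first, choose hyperparameters with `pick_hyperparameter` against `val`, and only then report a single number on `test`.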
