Yes, this is silly. They address it, and even admit that using that field alone basically results in perfect classification, but they don't do the logical thing and give the whole exercise up as pointless. Instead they just break the Account Create Time up into individual features: "Account Creation Hour", "Account Creation Minute". Seriously?

The reality is that including that field in the metadata makes identifying a user from metadata trivial, and not an interesting case for ML. In order to publish, they "degraded" the data until it was just interesting enough to be headline-worthy. Insulting.