The rules seem pretty clear that consent is required from any persons appearing in any external datasets that are required. The winners scraped data from YouTube videos, so I am not sure what the issue is.

The more worrying takeaway is that the winners scraped videos from people who clearly had no intention of their videos being used for a deepfake detection algorithm. Yet they did not think of the ethical considerations of using that data (did everyone in the video even have a say in the video being uploaded?). I think Kaggle disqualifying the team is the right move (even if it's a painful one for the winners).

TheAdamAndChe|5 years ago

The article states the videos used a Creative Commons license that allowed for commercial use. It is an extremely liberal license that does not state "free for commercial use except for when used with facial recognition."

KingOfCoders|5 years ago

For people in a video, you need a model release from them. This is also a mistake many people make: they use Creative Commons licenses and think they are safe. A picture or a video needs model releases for the people in it (several exemptions apply).

quietbritishjim|5 years ago

But that Creative Commons licence was issued by the copyright holder of those videos, not the people in them. The people in those videos may not even have agreed to appear in the video if they were in a public place (the relevant legal term, at least here in the UK, is "reasonable expectation of privacy"). So if Kaggle requires people in the videos to consent to taking part, then that consent cannot be inferred from that licence.

What's more, if that consent is not legally required (there's a heavy "if" in this sentence; IANAL, so I don't pretend to know whether it's required, e.g. under GDPR, but let's assume for a moment that it's not), then Kaggle are still perfectly within their rights to ask for that permission to qualify for their competition. After all, it's their competition, and it's totally reasonable for them to set an ethical criterion that's even higher than legally required.

reedwolf|5 years ago

Yeah, with $1 million at stake, I can't believe this team of really smart people made such an incredible blunder.

The whole reason Facebook launched this challenge was to try and bury the bad PR over their data practices. If people in the external datasets had complained about the unauthorized use of their faces in the winning solution, it would've been pretty embarrassing for FB.

nl|5 years ago

Note that isn't part of the rules. It's part of the "Winning submission documentation requirements" which is a separate document and wasn't mentioned at all on the "external data" Kaggle thread, which had Kaggle moderators explaining the rules.

Documentation requirements are pretty standard in Kaggle competitions, and usually cover having to supply your code, and maybe write a blog post about it. I've never seen one that had major rules in it.

nostrebored|5 years ago

I'm with you here. There are ethical concerns, legal concerns for productization, and overall this defeats the purpose of creating novel algorithms rather than a better trained model.

For instance, with the same scraping being used to train the deepfake GAN, would their model be more or less effective than a competitor model?

It seems like they won from a disparity in data, not an innovative technical approach.

oars|5 years ago

It's much better they learn now by being banned from a competition rather than having a lawsuit filed against them in the future.

The correct decision was made.

sjg007|5 years ago

What if you took commercial video, like a news broadcast, instead of YouTube? Would that still be off limits?