Do we need a Human Data Project?

[+] JunkDNA|12 years ago|reply

There is a fundamental problem with opening up so-called anonymized data: usually the best, highest-detail data can't be properly anonymized. Take Fitbit data. You'd think that would safely be anonymous. But all I have to do is tweet a few times about my progress and I bet you can identify me. I might not have minded tweeting that I reached 1000 steps today. But now that innocuous tweet links all my "anonymous" data submitted to the commons directly to me.

Most people don't realize how easy it is to use scraps of public info to identify people in data sets (the canonical example is here: http://arstechnica.com/tech-policy/2009/09/your-secrets-live...).
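To make the linkage concrete, here is a minimal Python sketch of that kind of matching. Everything in it is made up for illustration: the pseudonymous IDs, the dates, the step counts, and the `candidates` helper are assumptions, not a real Fitbit export or any real data set.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: (pseudonymous_id, date, step_count).
released = [
    ("u1", "2013-04-01", 1000),
    ("u1", "2013-04-02", 8421),
    ("u2", "2013-04-01", 1000),
    ("u2", "2013-04-02", 7310),
    ("u3", "2013-04-01", 5230),
    ("u3", "2013-04-02", 8421),
]

# Hypothetical public tweets from one named person: (date, step_count).
tweets = [("2013-04-01", 1000), ("2013-04-02", 8421)]


def candidates(released, tweets):
    """Return the pseudonymous IDs whose records contain every tweeted value."""
    by_id = defaultdict(set)
    for pid, date, steps in released:
        by_id[pid].add((date, steps))
    return sorted(pid for pid, days in by_id.items() if set(tweets) <= days)


print(candidates(released, tweets[:1]))  # one tweet leaves ['u1', 'u2']
print(candidates(released, tweets))      # two tweets leave ['u1']
```

In this toy data a single tweet still leaves two candidates, but a second one narrows it to one ID, and at that point every other record under that ID is tied to a name.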

For research to be bioethically sound, people have to be able to give informed consent. Most people are ill-equipped to get their heads around these ideas well enough to actually be informed. I see this all the time with DNA data. Even exceptionally bright researchers can delude themselves into thinking that DNA sequence data isn't identifiable because you'd "need some other DNA linked to the person to identify them". As if that's going to be a problem in another few years.

I agree in principle it would be great to have vast troves of data to mine. I'm just not sure how to square that with existing laws and regulations.

[+] SteveArmstrong|12 years ago|reply

With the way the article compared it to donating organs, I figured this was a system for releasing information after death. In that case, the de-anonymizing problem is less of an issue. (Informed consent about this problem would still be needed, of course.)

[+] kumarski|12 years ago|reply

That makes sense. I guess the system would have to be very much about informed consent, or filter out data that makes it too easy to trace back to individuals?

Nice Ars link.

[+] danielsiders|12 years ago|reply

This is one of the long term goals of Tent (https://tent.io). Store all your highly personal data somewhere you control it completely with the option to share it under certain conditions with others.

[+] dangoldin|12 years ago|reply

Pretty cool - thanks for sharing.

I've always wondered why this information can't be kept in your browser and be shared through that. That way the data is kept locally without even needing to rely on any 3rd party service.

[+] kumarski|12 years ago|reply

Interesting... tent.io seems pretty cool. Have you used it before?

[+] k2xl|12 years ago|reply

Also interesting to note that President Obama announced recently that taxpayers received a crazy $800 billion return from the $3.8 billion investment in the Human Genome Project.

[+] wslh|12 years ago|reply

Maybe he's talking about the value of patenting DNA ;-)
