Open Sourcing My Personal Medical Record

[+] howderek|8 years ago|reply

Many others and I have been doing this for a long time. The Harvard Personal Genome Project [1] is a large open database of people's genetic and phenotypic information. Here is my profile: https://my.pgp-hms.org/profile/hu1247AF

You can add your Personal Health Record to it.

[1] http://personalgenomes.org/

[+] abetusk|8 years ago|reply

The Harvard Personal Genome Project is great (I'm a participant) but there are some other projects that are complementary as well, such as Open Humans [1] and Open SNP [2].

[1] https://www.openhumans.org/

[2] https://opensnp.org/

[+] blaurenceclark|8 years ago|reply

Ah, awesome! Shoot me an email, I would love to chat more about that. The one thing I wanted to offer that's different from those other sources: if I were an engineer who wanted to see what working with EMR data is like, I couldn't find an EMR export (CCDA and raw notes) anywhere online to analyze with ML and NLP.

[+] jpobst|8 years ago|reply

I have a friend who is a medical researcher and it definitely seems they are stuck in the past.

In order to study something, he has to:

* Come up with a hypothesis that X may cause Y

* Request access to data about that hypothesis

* He is only given the data regarding his hypothesis

* He can then study whether his hypothesis has merit or not

We should be dumping these whole datasets into machine learning and having computers give us potential links to explore. Obviously there will be plenty of things that turn out to be unrelated, but it's also very likely the computer can find links that a human would not have considered.

I don't see it changing any time soon in the US, but I suspect other countries with this data will use it, and we'll find the next generation of medical breakthroughs no longer come from the US.

[+] specialist|8 years ago|reply

Every usage of every bit of medical data requires patient consent.

The potential for abuse is not hypothetical.

While I was implementing medical information exchanges, every single participant considered patient data to be their own, to be used as they wish. Our (grand)parent company, a lab, was negotiating with Microsoft, Google, pharmas, etc. Each was trying to figure out how to monetize it. For example, targeted ads.

The C (executive) level players mocked HIPAA and the other (meager) patient and consumer protections the same way they mocked Sarbanes-Oxley, environmental protections, financial reporting requirements, etc. If you think Google and Facebook are bad...

---

My data, all that is known about me, is my identity. It's me.

At the very least, if someone's going to profit from my data, I want my cut.

[+] mattjack|8 years ago|reply

I agree with kharms

>You're describing P-value hacking

Here's an example of what can happen when you take a huge corpus of data and throw an equally huge number of hypotheses at it to see what sticks: https://io9.gizmodo.com/i-fooled-millions-into-thinking-choc...

tl;dr: he "proved" chocolate causes weight loss by comparing chocolate- and non-chocolate-eaters on a very high number of health indicators.

That also introduces the multiple testing problem: https://www.wikiwand.com/en/Multiple_comparisons_problem

The more statistical tests you run against a set of data (EDIT: the more variables you test against a dataset), the higher the chance you get a statistically significant result from random error alone.
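To put rough numbers on that, here is a minimal sketch. It assumes every test is independent and every null hypothesis is true, which real health indicators won't satisfy exactly; the function names and the test counts are made up for illustration.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across `num_tests` independent
    tests when every null hypothesis is true (an idealized assumption)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise rate at or
    below `alpha` (the Bonferroni correction, one common fix among several)."""
    return alpha / num_tests


if __name__ == "__main__":
    # Arbitrary test counts, chosen only to show how fast the rate grows.
    for m in (1, 20, 100):
        print(m, round(familywise_error_rate(m), 3), bonferroni_threshold(m))
```

The point of the second function is that the more variables you test, the stricter each individual test has to be before a "significant" result means anything.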

[+] kharms|8 years ago|reply

>We should be dumping these whole datasets into machine learning and having computers give us potential links to explore.

You're describing P-value hacking, thus named because hack scientists use this technique to publish papers about nonsense.

[+] troyastorino|8 years ago|reply

Sadly, even countries with universal healthcare systems don't have universal health informatics systems. The NHS is a prime example: they spent £12B trying to build an integrated system [1]. Lots of countries have tried, including the US: HIPAA was actually originally about data portability [2], and we just spent another $40B [3]. Thus far only smaller countries have had success with integrated health IT systems [4].

[1] https://en.wikipedia.org/wiki/NHS_Connecting_for_Health

[2] https://en.wikipedia.org/wiki/Health_Insurance_Portability_a...

[3] https://en.wikipedia.org/wiki/Health_Information_Technology_...

[4] https://en.wikipedia.org/wiki/Healthcare_in_Denmark#eHealth

[+] diegoprzl|8 years ago|reply

Well, that's the way it's meant to be. Exploratory analysis doesn't have the same purpose as hypothesis testing.

[+] blaurenceclark|8 years ago|reply

100% agree, I hope we can get there!

[+] voicedYoda|8 years ago|reply

Thank you for sharing this. We (I) founded a company to help with several niche aspects of healthcare and the bureaucratic issues faced by administrations. We are finding success with data transactions. While there are some companies out there who work really hard to make transaction engines, the result is not very efficient, is very expensive, and doesn't benefit the consumer at all.

My past experience as a software developer was, "Give me all the data, and tell me what you need, then I'll make it work." I even worked for a very large EMR (probably the biggest on the planet), and getting a patient record out of their system is a nightmare, even though the foundation of their application is the patient record.

I'd love to converse more about what you're building, as we capture many unstructured documents and are now using ML to grab details out of these and match to criteria.

[+] blaurenceclark|8 years ago|reply

Absolutely, please shoot me an email!

[+] awinter-py|8 years ago|reply

We should (as a society) consider open sourcing every medical record.

Medical privacy is ethically tricky. It (1) protects bad doctors, (2) makes it harder to develop treatments, (3) makes it hard for consumers to shop intelligently.

Medical privacy would be useful when negotiating cost of coverage with your insurer, but they have a contractual right to demand your complete medical record.

The best arguments I've heard for medical privacy are (1) you might not get a job if you're sick, (2) shame factor could prevent people from going for treatment and (3) you may not get a date if you have, say, herpes. (#3 is true but not necessarily a strong point from a social standpoint).

[+] codemac|8 years ago|reply

Your #1, #2, and #3 are all the same thing, but I think it's hugely important:

Medical records can show all kinds of markers about your past / current behavior that let people paint pretty horrible assumptions about each other.

Type 2 Diabetes? Man you must eat poorly.

Herpes? You must have gotten it from being promiscuous and risky.

Depression? Must not be able to deal with the shit that is real life.

Hormone therapy? Dental issues? Pain killers? Allergies? I mean, the list is almost as long as the list of all medical issues that people have.

People attach behavioral, moral, or ethical judgments to just about every medical condition, and those judgments are almost entirely unfair. I think medical privacy is hugely important for society as it currently is; losing it would not change these attitudes, but would instead make it easier to discriminate against the people affected.

[+] pavel_lishin|8 years ago|reply

Herpes. Mental illness. A history of suicide attempts. HIV+ status. The fact that you weren't born with your current gender. The fact that you've miscarried three times, and are currently pregnant. The fact that your child has fetal alcohol syndrome.

All super fun facts that people would love for friends, coworkers and strangers to be able to find out.

I understand that there are good arguments for releasing medical data, but this is just the "if you have nothing to hide, what are you worried about?" argument.

[+] sweden|8 years ago|reply

Medical records can also be used as identification since they contain data that is unique only to you.

http://www.reuters.com/article/us-cybersecurity-hospitals-id...

It is very easy for the typical software engineer to come up with the brilliant idea of open sourcing everything without thinking of any of the consequences. But the real world is much more complex than that.

[+] comboy|8 years ago|reply

Simple solution. Make medical records privacy opt-in.

Most people won't bother (strong default effect), so lots of data for research, and those who care can still have their privacy. It won't exactly be a random sample, but it should still be better than what's available currently.

[+] maxerickson|8 years ago|reply

The current law in the US is that insurers can't consider medical history (they can ask your age and whether you smoke).

Looks like that has a fair chance of changing though.

[+] roywiggins|8 years ago|reply

Do you really want to out every trans person who has told a doctor about it?

[+] willpearse|8 years ago|reply

This is a risky thing to do when the patient's name is attached to it. Insurance companies, salesmen, etc., could do quite a lot with such information.

I whole-heartedly support the general idea, and making a centralised database of things like this would be great. Such a database would probably make it easier to anonymise the data as well.

[+] kwhitefoot|8 years ago|reply

Even without the patient's name attached, it is easy to identify people because of the necessary metadata in the record. If you expect to get useful information about, for instance, lung disease, the record will have to contain information about exposure to likely causes, age, occupation, region of the country (possibly town), and sex. It will also contain marital status, whether one has children, drinking and smoking habits, weight, and ethnicity.

This is pretty close to unique, just like a browser fingerprint.

See for instance: http://randomwalker.info/publications/no-silver-bullet-de-id...
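A toy sketch of how you could measure this on a dataset: count how many records have a combination of attributes that nobody else shares. The records, field names, and the `unique_fraction` helper below are all made up for illustration, not taken from any real export.

```python
from collections import Counter

# Hypothetical, made-up records: no names, only metadata of the kind listed above.
RECORDS = [
    {"age": 34, "sex": "F", "town": "Springfield", "occupation": "welder", "smoker": True},
    {"age": 34, "sex": "F", "town": "Springfield", "occupation": "teacher", "smoker": False},
    {"age": 51, "sex": "M", "town": "Springfield", "occupation": "welder", "smoker": True},
    {"age": 51, "sex": "M", "town": "Shelbyville", "occupation": "miner", "smoker": True},
    {"age": 34, "sex": "F", "town": "Shelbyville", "occupation": "teacher", "smoker": False},
]


def unique_fraction(rows, keys):
    """Fraction of rows whose combination of `keys` values appears exactly once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    singled_out = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return singled_out / len(rows)


if __name__ == "__main__":
    # Few attributes: everyone hides in a group. All attributes: everyone stands alone.
    print(unique_fraction(RECORDS, ("age", "sex")))
    print(unique_fraction(RECORDS, ("age", "sex", "town", "occupation", "smoker")))
```

In this toy data nobody is unique on age and sex alone, but every record is unique once all five fields are combined, and none of them is a name.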

[+] herman5|8 years ago|reply

Anonymize, then let the floodgates open.

[+] blaurenceclark|8 years ago|reply

Author here. If anyone has any questions about dealing with the medical system or about clinical trials, ask away!

[+] kiddico|8 years ago|reply

I'm guessing you're the author then?

I'm just making my way out of a course called Health Informatics. Most of what we've done is look at HIPAA and the standards that make sending patient info from one hospital to another possible. In general the whole situation is a mess. I understand the purpose of not sharing identifiable data with the world: it stops people from being targeted because of their conditions. But we have a wealth of information that's been made effectively useless from a research perspective.

This isn't much of a question, I just wanted to express my frustration with the whole thing as well. That said, I've got a lot of respect for your mission, and the balls required to publish your otherwise HIPAA-protected info.

[+] troyastorino|8 years ago|reply

Hey Brian, it's really great that you're doing this :)

If there's more to your medical history that you want to track down, or you want to get your data transformed into a structured format, you should reach out to us at PicnicHealth and we'll see what we can do.

[+] herman5|8 years ago|reply

What was the actual process of acquiring your entire medical record? My understanding was that this information can be highly fragmented depending on the number of different places one has received medical treatment.

[+] JoshMandel|8 years ago|reply

Thanks for this! Have you considered sharing DICOM files, too (i.e. the actual images from your MRI, in addition to the reports)? If so, what went into the decision not to include these?

[+] tranv94|8 years ago|reply

I'm really confused about what this article's purpose is. You first talk about the issues with clinical trials, then you throw in a tidbit about just feeling like making your medical records public because you couldn't find many open medical records?

[+] blaurenceclark|8 years ago|reply

Good feedback. When I was initially diagnosed and wanted to start working on this problem, I had no idea what a medical record looked like, so I didn't know what the data I'd be working with looked like. It can be tricky to do a data project without knowing the data structure :) I just wanted to share mine so that anyone who wants to tackle something medical-record related in the future can see what the data sets they'll be working with may look like!

The clinical trial bit is our specific use of that data.

[+] brynlewis|8 years ago|reply

You can view the CDA (xml) documents here: http://intelsoft.com.au/challenge/index.htm

CDA documents are XML documents conforming to a schema specified for medical documents.
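For anyone who hasn't seen one, here is a rough sketch of reading a few fields with Python's standard library. The sample is a heavily trimmed, made-up fragment (not schema-valid, and not from the linked files), and the `summarize` helper is hypothetical; the `urn:hl7-org:v3` namespace and the `ClinicalDocument` root are what CDA documents use.

```python
import xml.etree.ElementTree as ET

# CDA elements live in the HL7 v3 namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# Made-up, heavily trimmed fragment for illustration only; a real CDA document
# carries many more required elements (ids, codes, author, custodian, sections).
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <title>Example Summary</title>
  <recordTarget>
    <patientRole>
      <patient>
        <name><given>Jane</given><family>Doe</family></name>
        <birthTime value="19800101"/>
      </patient>
    </patientRole>
  </recordTarget>
</ClinicalDocument>"""


def summarize(xml_text):
    """Pull the title and basic patient fields out of a CDA-style document."""
    root = ET.fromstring(xml_text)
    patient = root.find("hl7:recordTarget/hl7:patientRole/hl7:patient", NS)
    return {
        "title": root.findtext("hl7:title", namespaces=NS),
        "given": patient.findtext("hl7:name/hl7:given", namespaces=NS),
        "family": patient.findtext("hl7:name/hl7:family", namespaces=NS),
        "birth_time": patient.find("hl7:birthTime", NS).get("value"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Real exports nest the clinical content (problems, medications, results) much deeper, so expect to write a lot more path expressions than this.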

[+] IdleChris|8 years ago|reply

Was there a link to the actual MRI file and not just the JPEG slices of the damaged bone?

[+] blaurenceclark|8 years ago|reply

Unfortunately my provider doesn't offer the full file for download. I need to drive there and pick up a CD, and I haven't had the time to do that yet. I plan to get the full file soon!

[+] Kinnard|8 years ago|reply

I bet this at scale, like a github for medical records, would be revolutionary.

[+] ipunchghosts|8 years ago|reply

PicnicHealth could easily add an "opt in" option whereby patients can opt their data into trials. Institutions could pay for access to all this curated data to use for testing or recruitment of patients.

79 comments