Ask HN: How to be my own genetic disease researcher for my partner?
262 points| thetwentyone | 4 years ago | reply
I feel like it would be impossible for a doctor to stay abreast of all of the possible links/data unless they focused very narrowly on a patient.
I'd like to try and fill that gap - look at the data and relay any potential links/causes to the providers.
We have the full genome in CRAM, CRAI, FASTQ, VCF, and TBI data - is there a way that I, a medical layman but well-informed person, could leverage this data to mine for possible matching genetic variants?
e.g. I have started finding genes associated with my partner's condition in the NCBI website and the ClinVar Miner (https://clinvarminer.genetics.utah.edu/variants-by-condition)
Is it sufficient to identify variants by searching for the SNP string (e.g. "rsXXXXXX") in the VCF file?
Are there "hacker's guide to genomic analysis" resources out there?
[+] [-] mattmight|4 years ago|reply
I'm happy to help.
I've written down an Algorithm for Precision Medicine that abstracts the journey all the way from diagnosis to treatment:
https://bertrand.might.net/articles/algorithm-for-precision-...
My day job is now to help patients like your partner all day every day at the Precision Medicine Institute at UAB.
Feel free to reach out to us, and we'll be happy to craft research strategy and provide technical tips.
[+] [-] nextos|4 years ago|reply
1. Search for variants in that genome where the allele frequency is close to 0 in a very large population e.g. https://gnomad.broadinstitute.org/
2. Look into variant effects for those you prioritized in step 1 using https://www.ensembl.org/info/docs/tools/vep/index.html
Rare diseases are typically due to a coding mutation that alters the protein coding sequence in some significant way.
If you need help, contact details are on my profile. I do this for a living at a university.
rsIDs are a minefield as they change often, there are synonyms and probably you won't have all loci properly annotated. Don't rely on that too much unless you really know what you are doing.
If it's not a rare disease, this gets quite a bit more difficult. Also, depending on the whole genome sequencing platform you have used, many structural variants (e.g. deletions or insertions of large chunks of DNA) won't be easy to measure.
Other comments have suggested Promethease, which will give you a bit of help if it's not a rare disease (e.g. if it's an autoimmune one, it's good at imputing HLA and finding risk haplotypes).
My whole comment is a bit of an oversimplification, but I think these suggestions are a good starting point.
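A minimal sketch of step 1, assuming the VCF has already been annotated so that each record's INFO column carries a population allele frequency. The INFO key `gnomAD_AF`, the 0.001 cutoff, and the example records are all assumptions made up for illustration; check what your annotation tool actually writes before relying on any of them.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=30;gnomAD_AF=0.0001' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
        elif item and item != ".":
            info[item] = ""  # flag without a value
    return info


def rare_variants(vcf_lines, af_key="gnomAD_AF", max_af=0.001):
    """Yield (chrom, pos, ref, alt, af) for records at or below max_af.

    Records without a usable frequency are kept and reported with af=None,
    since a variant missing from the population database may be very rare.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        raw = parse_info(fields[7]).get(af_key, "")
        # Multi-allelic sites list one frequency per ALT allele; use the smallest.
        afs = [float(x) for x in raw.split(",") if x not in ("", ".")]
        if not afs:
            yield chrom, int(pos), ref, alt, None
        elif min(afs) <= max_af:
            yield chrom, int(pos), ref, alt, min(afs)


# Made-up records, only to show the shape of the input.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\tDP=30;gnomAD_AF=0.25",
    "chr1\t2000\t.\tC\tT\t50\tPASS\tDP=28;gnomAD_AF=0.00002",
    "chr2\t3000\t.\tG\tA\t50\tPASS\tDP=31",
]
hits = list(rare_variants(example))
for hit in hits:
    print(hit)
```

A whole genome will still leave a long list after this filter, which is why step 2 (predicted effect on the protein) matters.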
[+] [-] jnotwell|4 years ago|reply
I haven't worked on this problem, but others in my graduate lab did. If you're interested in a tool that automates some of this process (takes VCF as input; filters variants based on frequency; you'll need to map disease symptoms/phenotypes to Human Phenotype Ontology [1] identifiers), some of my former lab mates developed a web tool [2]: https://amelie.stanford.edu/submit
[1] https://hpo.jax.org/app/
[2] https://www.medrxiv.org/content/10.1101/2020.12.29.20248974v...
[+] [-] Roguedr|4 years ago|reply
My son was born 3 months ago with Poland syndrome. This came as a shock but it has also drawn me to look into the scientific literature.
While the common belief was that PS has no underlying genetic cause, there are papers suggesting that there may be.
While studying, I ran across many anomalies on my own body (I'm his father), so minor that they were never considered relevant until now (I'm 40 and lead a normal life).
It would seem that on the right side of my body I have at least:
If the above is correct, this may be an opportunity (by studying my genome and my son's genome) to establish a link or a common cause for Becker Nevus Syndrome and Poland Syndrome - both fairly rare anomalies. Can you suggest who may be interested in studying this?
This has no value for me or my son, however the scientific endeavour may be of value for the future.
[+] [-] thedudejohan|4 years ago|reply
[+] [-] fatboy93|4 years ago|reply
I'd also love to help OP :)
[+] [-] xab31|4 years ago|reply
OP, how did you even get the sequence to begin with? I have a friend who has an immunodeficiency which is almost certainly due to a rare genetic disorder and want to do a very similar thing. Despite contacting his physician, fellow researchers, and even my institution's president -- with friend's full cooperation -- no one is willing to pay for it.
I'm at my wit's end to the point that I'm starting to think the only viable option is paying for it out of pocket, but it's not cheap.
A question you might want to ponder is: suppose you isolate the problem to a single missense/nonsense/truncation mutation in a protein that seems likely to cause the phenotype. How do you plan to use that information? In theory, there is gene therapy, but in reality, given how much effort I have had to go through just to get this fellow sequenced -- and I'm a PhD working in genomics with a lot of contacts -- creating a custom one-off gene therapy solution seems like it would be a very tremendous undertaking.
There is a very difficult problem here in that rare or "personalized" disease treatments are: A) not profitable, so drug companies have no interest, B) there are mountains of paperwork, IRBs, consent waivers, etc, involved in developing an experimental therapeutic, C) by definition you cannot do a proper clinical trial on a one-off, and D) it requires several different types of expertise to pull such a thing off. Sadly this means that it almost never happens, even though I suspect there are a lot of severe and lifelong genetic disorders which could be diagnosed and treated with technology available today.
Based on my experience so far, I suspect that even if you were to hand his physician very strong evidence that "the problem is caused by this specific single mutation", the response will be "OK, thanks". You should not make strong assumptions about them being able to take it from there. All this is based on the best-case scenario of it being a single variant in a coding region; if the disorder is caused by multiple variants at different loci, anything you find will probably not be actionable.
[+] [-] patall|4 years ago|reply
(I do not want to demotivate but please be cautious. Even the best analysts have 'only' around 50% case-solve rate. If it is an adult-onset-disease, chances can be lower as the disease mechanism in that case may not be 'consequential' enough to be naturally selected against)
[+] [-] rvnx|4 years ago|reply
Love your offer to help.
[+] [-] zoe4883|4 years ago|reply
Genetics is very hard, and in a very good case you may get a 10% correlation. Then you will have to convince specialists to chase this weak possibility...
Working on this will consume your time and put you under stress. This energy could be spent on your partner instead. You will need a lot of energy if it progresses.
Also, there is always a big chance of misdiagnosis. Simple things like a food allergy can be mistaken for many illnesses. Perhaps the best first action is to verify the diagnosis: get a second opinion, or change the environment to rule out common triggers.
[+] [-] 8as746fd4a5df|4 years ago|reply
[+] [-] thr0awa4|4 years ago|reply
Nothing wrong with research...if there are existing tools within reach (and it seems like there are) then it seems like it'd be interesting and possibly helpful to dive in. But I'd strongly encourage you to timebox it as a project so it doesn't grow into something unhealthy.
[+] [-] siva7|4 years ago|reply
[+] [-] adityaathalye|4 years ago|reply
Hunting down my son's killer: https://matt.might.net/articles/my-sons-killer/
You may also want to email him. Anecdotally, I believe the rare disease research community is small and willing to listen to outliers.
University department page: https://www.uab.edu/medicine/pmi/matt-might
(Edit: fixed urls, typos, grouping.)
[+] [-] gonehome|4 years ago|reply
This was it and seems relevant: http://www.cureffi.org/2019/04/29/financial-modeling-in-rare...
[+] [-] jostmey|4 years ago|reply
[+] [-] farresito|4 years ago|reply
- Annotate the variants with their frequencies. You can do that with Ensembl's VEP. There's an official Docker image on their website that I suggest you use. You will have to run "./INSTALL.pl" to download some files (they call them caches) and then "vep -i /genome.vcf --af --max_af --af_1kg --af_esp --af_gnomad -o genome_with_freqs.vcf --cache".
- If you know a specific region of the genome you want to look at, you can use tabix to extract all the variants in it. For example: "tabix genome.vcf.gz chr16:82,624,969-83,802,640 > cdh13_variants" will extract all the variants in that specific region.
- Use igv to browse the variants visually.
In general, the tools you will come across aren't very intuitive and their CLIs aren't good. Prepare to spend a decent amount of time making sense of everything. Honestly, if you know of a family member of hers who happens to have similar symptoms, I would say the best thing you could do is to sequence that genome and see where they overlap. In any case, read some studies, find the loci where the variants cause problems, and then extract the variants.
EDIT: On this last point: the cool thing about igv is that you can open vcf files and see only the variants; you can search for variants ("rs123456" in the search field) and it will show the variant and its surroundings; you can search for "chr16:82,624,969-83,802,640" and it will limit what's visible to that region; you can search for a gene and the search field will show you the region of the genome that the gene spans, which you can use for tabix later. You will often see in studies something like "variants in locus p13.12 of chromosome 16 were shown to have an effect in...". Right below the search bar you can see all those loci (p13.3, p13.2, etc.). Good luck! If you add your email in your profile, I will contact you.
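For anyone who wants to see what the region query does before installing tabix, here is a rough pure-Python sketch. It scans the file line by line with no index, so it is far slower than tabix on a whole genome; the function names and the example records are made up for illustration.

```python
def parse_region(region):
    """Split 'chr16:82,624,969-83,802,640' into ('chr16', 82624969, 83802640)."""
    chrom, span = region.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    return chrom, start, end


def variants_in_region(vcf_lines, region):
    """Yield VCF data lines whose POS falls inside the region (inclusive)."""
    chrom, start, end = parse_region(region)
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        # The chromosome name must match the file's own naming ('chr16' vs '16').
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line.rstrip("\n")


# Made-up records; for a real .vcf.gz, iterate over gzip.open(path, "rt") instead.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr16\t82000000\t.\tA\tG\t50\tPASS\t.",
    "chr16\t83000000\t.\tC\tT\t50\tPASS\t.",
    "chr17\t83000000\t.\tG\tA\t50\tPASS\t.",
]
for record in variants_in_region(example, "chr16:82,624,969-83,802,640"):
    print(record)
```

If a query comes back empty, check whether the file names chromosomes as "chr16" or "16" before concluding there are no variants there.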
[+] [-] thetwentyone|4 years ago|reply
[+] [-] dnuntius|4 years ago|reply
I strongly recommend that you start by consulting with multiple doctors until you find one or more who shows interest in your partner's case, understands probable causes, and demonstrates useful expertise in treatment. You may need to visit a larger hospital that is involved in research. Work the system. You are a recruiter.
Pick the primary doctor who will coordinate your partner's treatment. This doctor will be your partner's primary line of defense, and they will mentor you in any investigations you do. They will guide you through the maze of therapies, palliative care, research, social workers, and other forms of support. They will connect you to other specialists who can help.
Good luck!
[+] [-] robwwilliams|4 years ago|reply
[+] [-] Atal|4 years ago|reply
Another avenue might be a crowdsourced rare disease research organization like [2]
I have no relation to any of the above but read a book about the UDN that may be of interest to you: [3]
[1] https://undiagnosed.hms.harvard.edu/apply/
[2] https://www.researchtothepeople.org/
[3] https://www.goodreads.com/en/book/show/53317420-the-genome-o...
[+] [-] rancar2|4 years ago|reply
[+] [-] gardenfelder|4 years ago|reply
[+] [-] Knufen|4 years ago|reply
[+] [-] mbreese|4 years ago|reply
I do something very similar in a research lab, and while it’s possible to make decent headway in this without much training, there are dragons all over the place.
For example, you didn’t mention which reference genome was used for the alignment/variant calling. Unless you use the right version for annotation, you’ll just get junk annotations that won’t make sense. I’ve only seen a couple of comments mention this.
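One quick way to get a hint about the reference build is to read the VCF header. The sketch below assumes the header contains `##contig` lines with a `length` attribute, which many variant callers write but not all; it compares the chromosome 1 length against the published lengths for GRCh37 and GRCh38. The function name is made up, and a result of "unknown" only means the header did not say.

```python
import re

# Chromosome 1 lengths for the two common human reference builds.
CHR1_LENGTHS = {
    249250621: "GRCh37/hg19",
    248956422: "GRCh38/hg38",
}


def guess_build(header_lines):
    """Guess the reference build from the chr1 entry among ##contig header lines."""
    for line in header_lines:
        if not line.startswith("##contig="):
            continue
        contig_id = re.search(r"ID=([^,>]+)", line)
        length = re.search(r"length=(\d+)", line)
        if contig_id and length and contig_id.group(1) in ("1", "chr1"):
            return CHR1_LENGTHS.get(int(length.group(1)), "unknown")
    return "unknown"


# Made-up header for illustration.
header = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=248956422>",
]
print(guess_build(header))
```

Whatever this reports, confirm it with whoever produced the files, since annotation against the wrong build gives positions that look valid but point at the wrong bases.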
If you don’t have the background, you might also need a crash course in human genetics, inheritance, molecular biology and variant functional prediction. You don’t need to become an expert, but you will need a working knowledge so that you know which variants to ignore. There really should be a small handful that would potentially make sense as a causal variant.
If the condition is sufficiently rare, you may not find clinical annotations, so be prepared to look a little deeper.
Best of luck.
[+] [-] fakegenomics123|4 years ago|reply
As far as discovering new associations or causal relationships from a single WGS, you are probably not going to have any luck there.
[+] [-] tejtm|4 years ago|reply
For more just start learning the tools... I have not checked in on them for years now but "BioStars Handbook" was up & coming
[] https://github.com/webyrd/mediKanren
[] https://biostar.myshopify.com/
[+] [-] teekert|4 years ago|reply
You may find something, you may not, a lot is unknown and regions outside of the genes may be affected and even the cause of the phenotype, but we still understand very little of this.
Depending on where you live, genomic counseling is free and trio sequencing is usually part of it.
This is not really my expertise (more in oncology) but feel free to ask more questions.
If you have BAM (or CRAM + reference genome) files for parents and your partner, you could download a trial of VarSeq [0] to do a more GUI based analysis of the results.
"Is it sufficient to identify variants by searching for the SNP string (e.g. "rsXXXXXX") in the VCF file?" If the variants have been associated with the same phenotype as your partner's, then yes, it is interesting. If there is no phenotype, perhaps you can track down the source publication and try to talk to the authors.
There are probably groups online with people in the same situation, try to find them, they can probably help you a lot more.
[0]: https://www.goldenhelix.com/products/VarSeq/index.html
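On the rsID question, a plain text search can be sketched as below, under the assumption that the VCF's ID column was actually populated with rsIDs. The function name and the example records are made up. An empty result does not prove the variant is absent: the ID column is often "." when the file was never annotated, rsIDs get merged and renamed, and a regular VCF usually lists only sites that differ from the reference.

```python
def find_by_rsid(vcf_lines, rsid):
    """Return (chrom, pos, ref, alt, genotype) for records whose ID column lists rsid.

    genotype is the first sample's GT value, or None if the file has no sample column.
    """
    matches = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # The ID column may hold several identifiers separated by semicolons.
        if rsid not in fields[2].split(";"):
            continue
        genotype = None
        if len(fields) > 9:
            keys = fields[8].split(":")
            values = fields[9].split(":")
            if "GT" in keys:
                genotype = values[keys.index("GT")]
        matches.append((fields[0], int(fields[1]), fields[3], fields[4], genotype))
    return matches


# Made-up records for illustration.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "chr16\t83000000\trs123456\tC\tT\t50\tPASS\t.\tGT\t0/1",
    "chr16\t83000500\t.\tG\tA\t50\tPASS\t.\tGT\t1/1",
]
print(find_by_rsid(example, "rs123456"))
```

Looking the variant up by chromosome, position, and alleles on the correct reference build is more robust than matching the rsID string alone.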
[+] [-] teekert|4 years ago|reply
Admittedly, your situation is different. Still, your partner may need you more as a supporting, fun, optimistic person rather than the miserable piece of human you can become from a bottomless rabbit hole like genomics, where the answer to your partner's problem may forever seem like "almost within your grasp".
[+] [-] teekert|4 years ago|reply
[+] [-] carbocation|4 years ago|reply
https://www.cureffi.org/about/
Sonia: https://www.broadinstitute.org/bios/sonia-vallabh
Eric: https://www.broadinstitute.org/bios/eric-minikel
They are also hiring: https://broadinstitute.wd1.myworkdayjobs.com/broad_institute...
[+] [-] heuermh|4 years ago|reply
In open source bioinformatics we strive for reproducible science, which can be difficult in a field with tons of different methods and tools. One approach is to use a workflow language such as Nextflow [0] and Docker/Singularity such that the entire analysis is reproducible, see e.g. [1].
There is a vibrant community around Nextflow workflows called nf-core [2] which has a rare disease workflow in development [3], come join our slack!
[0] - https://nextflow.io
[1] - https://github.com/brentp/rare-disease-wf
[2] - https://nf-co.re
[3] - https://nf-co.re/raredisease
[+] [-] wfhpw|4 years ago|reply
[1] https://www.promethease.com/
[+] [-] thetwentyone|4 years ago|reply
- Of course I want to spend time with my partner and don't see this as the "way to fix everything".
- The literature surrounding my partner's disease calls it a "rare disease", but the number of patients in the US is in the tens of thousands. I'm not trying to find a new gene/SNP associated with the disease, just to cross-reference against research that has been done by others.
- The diagnosis is FSGS, and there is a history (since childhood) of high cholesterol.
[+] [-] tsol|4 years ago|reply
One easy way to find good research papers about it is to go to the Wikipedia page and look at the citations. They tend to cite the larger summaries, and they'll sometimes be available on pubmed; if not, you may need to look at scihub for the full text. That's a good introduction to the research paper side of information.
One thing I've also heard is that if you email the author of a research paper, they'll often send it to you for free. While distribution is through expensive journals, researchers can and often want to distribute them freely to interested parties. Just another alley to search.
[+] [-] a-dub|4 years ago|reply
would recommend trying to find supervision from an expert rather than just diving in alone. every field has its nuance.
[+] [-] shubb|4 years ago|reply
Maybe you could find researchers working on this topic (your disease or the generic problem of identifying causal mutations for a disease) and pay them to work on your case?
You might fight your own parking ticket but you'd get a lawyer for your murder defence...
Computer people get paid a hella lot more than university medical research people, especially those outside big cities or in Europe or Asia. It's more efficient to work hard at making money and hire a few experts.
[+] [-] AnthonBerg|4 years ago|reply
I'm in a not too dissimilar position, although a better known one.
It wasn't clear to me until recently what an astounding amount of good scientific results exist out there that are accessible on a device that's probably in your hands right now.
It's also been a surprise how useful Twitter is. Find good scientists that have real responsibilities toward the truth, are doing sound research, and talking on twitter to try to get the dialogue going. This kind of person is a very very useful link to the extant knowledge. And there's A LOT of it.
Some subreddits are also surprisingly deep.
It's a question of separating the wheat from the chaff. But it's possible. It is possible. You can do this.
[+] [-] fastaguy88|4 years ago|reply