Sequencing your DNA with a USB dongle and open source code

398 points | johntortugo | 4 years ago | stackoverflow.blog

176 comments

[+] m12k|4 years ago|reply
I'm really curious about what I could learn by getting my DNA sequenced, but I'm worried about my rights to not have it recorded and shared without my consent if I got someone else to do it for me - so any advance toward an affordable home test setup is very welcome.
[+] adabaed|4 years ago|reply
Imagine insurers refusing to give you a service due to your predisposition to certain diseases...
[+] dekhn|4 years ago|reply
Note that you are literally shedding identifiable DNA from your body at all times and a truly motivated adversary would have no problem obtaining enough sample material to do high quality sequencing.
[+] Method-X|4 years ago|reply
When I had 23andme sequence my DNA, I used a fake last name and pre-paid credit card.
[+] Gatsky|4 years ago|reply
It’s not exactly DIY but there are in theory ways to ‘encrypt’ your DNA before it gets sequenced. Something like amplifying/enzymatically modifying the DNA in a way that changes the sequence which you can undo computationally once you get the data back.
[+] biophysboy|4 years ago|reply
It's only valuable if somebody also interprets it for you, for example by telling you whether you have a genetic predisposition for certain diseases.
[+] fragmede|4 years ago|reply
I don't know if this is the exact nanopore USB dongle used in the article, but this one is $1,000 for the base package, first released in 2014

https://store.nanoporetech.com/us/minion.html

https://www.extremetech.com/extreme/190409-minion-usb-stick-...

[+] cge|4 years ago|reply
Note that Oxford Nanopore seems to have very much a "sell the ink/razor/etc" business model with their devices: that $1,000 package comes with one flow cell, which is a consumable and costs $900. They're essentially giving the device away for free.

On some of their larger devices (eg, the PromethION), they've moved outright to a "we lend you the device for free, you buy the consumables" model.

[+] koeng|4 years ago|reply
Yep, that’s the one. They update the flow cells over time. The bit they don’t tell you about is the extra equipment you need, like a Qubit fluorometer, to properly run the thing.
[+] LinuxBender|4 years ago|reply
This is very cool. Are there by chance any associated projects that could evolve into something like 23andme but remain entirely within a private network meaning that the data is entirely in the hands of the individual?
[+] mylons|4 years ago|reply
yes. if you wanted to annotate your genome you could “easily” do it on your brand new macbook (this is ram intensive, you probably need 32G). you’d need a reference genome, like https://www.nist.gov/programs-projects/genome-bottle

then you’d need a program like bwa http://bio-bwa.sourceforge.net/ to map your data.

then use https://samtools.github.io/bcftools/howtos/variant-calling.h... or something else to produce variants from the mapping results.

then compare your resultant vcf file to something like dbSNP: https://www.ncbi.nlm.nih.gov/snp/

at this point you can start generating a raw version of a 23andMe report.
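For anyone who wants to see the last step concretely, here is a minimal Python sketch of comparing called variants against a table of known ones. It assumes a plain-text VCF and a small in-memory lookup keyed by (chromosome, position, ref, alt); the names `parse_vcf_lines`, `annotate`, and all the example records are hypothetical and made up for illustration, not real calls or real dbSNP entries. A real comparison would read the actual dbSNP files and handle normalization of indels, which this does not.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Key: (chromosome, 1-based position, reference allele, alternate allele)
VariantKey = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> List[VariantKey]:
    """Extract (chrom, pos, ref, alt) keys from VCF text, one per ALT allele."""
    variants: List[VariantKey] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):  # skip blanks, meta lines, header
            continue
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):  # multi-allelic sites list ALTs comma-separated
            variants.append((chrom, pos, ref, alt))
    return variants


def annotate(
    variants: List[VariantKey], known: Dict[VariantKey, str]
) -> List[Tuple[VariantKey, Optional[str]]]:
    """Pair each variant with its known identifier, or None if it is not in the table."""
    return [(v, known.get(v)) for v in variants]


# Hypothetical data for illustration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr2\t2000\t.\tC\tT,G\t60\tPASS\t.",
]
EXAMPLE_KNOWN = {
    ("chr1", 1000, "A", "G"): "rs_example_1",
    ("chr2", 2000, "C", "G"): "rs_example_2",
}

if __name__ == "__main__":
    for variant, rsid in annotate(parse_vcf_lines(EXAMPLE_VCF), EXAMPLE_KNOWN):
        print(variant, rsid or "not in table")
```

Variants that come back with no identifier are the ones you would have to look at by hand; the ones that match are where a report can start attaching published annotations.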

[+] glofish|4 years ago|reply
Alas, the information presented is an oversimplification of the process.

To actually sequence DNA with this USB thingy you need to prepare a so-called sequencing library - and for that you need a fairly well equipped lab - expensive reagents and years of practice and skill ... a mid-level biology Ph.D. can prepare these ...

in addition the flowcell sold by Oxford Nanopore often malfunctions and the whole run is a bust ... (behaves like this since 2014 ... so no, the technology does not seem to improve a whole lot)

[+] inglor_cz|4 years ago|reply
DNA sequencing bugs me quite a bit.

On one hand, I would love to learn something new about my body.

On the other hand, what if the results tell me that I am predisposed to some horrible untreatable disease? Will I spend the rest of my days observing every little pain or discomfort and thinking "is this IT?"

[+] monopoledance|4 years ago|reply
I think you would have two scenarios at hand:

1. A completely genetically determined disease; a rare 100%-going-to-happen deal. (Which you would probably know about already, because your mother, or grandfather died from it...)

2. Some significant, but abstract risk modification.

With 1., you would know, you will get sick/die some time soon in the future, allowing you to live your life accordingly, die without regrets, prepared and so on. You can take that into consideration when planning for a family, taking job offers, procrastinating on the good life with work and retirement plans. Burn bright.

With 2., there is a very, very high chance that lifestyle choices influence the stated risk, as obviously not everybody who has the polymorphism gets sick. So you can get your ass up, exercise, quit smoking and drinking, reduce stress, get regular check-ups, ..., and avoid getting sick or reduce the impact/progression, in case you do.

I think, logically, knowing is always better than not knowing. But I understand how anxiety does tell a different story.

[+] wallacoloo|4 years ago|reply
well, build a whitelist of the conditions you are interested in knowing. then just run the report through a sed filter so that it strips out all the information you’re not interested in. destroy the original report. problem solved: infohazards avoided.
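A minimal sketch of that whitelist filter, in Python rather than sed. It assumes a hypothetical report format with one finding per line and the condition name as the first tab-separated field; `filter_report` and the example rows are made up for illustration. It matches the whole field rather than substrings, so a condition you did not list cannot slip through on a partial match.

```python
from typing import Iterable, List, Set


def filter_report(lines: Iterable[str], whitelist: Set[str]) -> List[str]:
    """Keep only lines whose condition name (first tab-separated field) is whitelisted."""
    wanted = {name.strip().lower() for name in whitelist}
    kept: List[str] = []
    for line in lines:
        condition = line.split("\t", 1)[0].strip().lower()
        if condition in wanted:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical report rows: "<condition>\t<finding>"
    report = [
        "Lactose intolerance\tlikely tolerant",
        "Condition I would rather not know\tsome result",
        "Caffeine metabolism\tfast",
    ]
    for row in filter_report(report, {"lactose intolerance", "caffeine metabolism"}):
        print(row)
```

Whoever runs the filter still sees the raw file, so the trick only works if the script runs unattended and the original is deleted afterwards.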
[+] wombatmobile|4 years ago|reply
Knowing something about your prospects doesn't doom you to negative thoughts. In fact, the way the human mind works is often the obverse.

"Inaction breeds doubt and fear. Action breeds confidence and courage. If you want to conquer fear, do not sit home and think about it. Go out and get busy." --Dale Carnegie

"You gain strength, courage and confidence by every experience in which you really stop to look fear in the face. You are able to say to yourself, 'I have lived through this horror. I can take the next thing that comes along.' You must do the thing you think you cannot do." --Eleanor Roosevelt

"Fear is the path to the Dark Side. Fear leads to anger, anger leads to hate, hate leads to suffering." --Yoda

"The brave man is not he who does not feel afraid, but he who conquers that fear." --Nelson Mandela

"Nothing in life is to be feared. It is only to be understood." --Marie Curie

"The key to change... is to let go of fear." --Rosanne Cash

"He who is not every day conquering some fear has not learned the secret of life." --Ralph Waldo Emerson

"We should all start to live before we get too old. Fear is stupid. So are regrets." --Marilyn Monroe

"Fear keeps us focused on the past or worried about the future. If we can acknowledge our fear, we can realize that right now we are okay. Right now, today, we are still alive, and our bodies are working marvelously. Our eyes can still see the beautiful sky. Our ears can still hear the voices of our loved ones." --Thich Nhat Hanh

[+] nomercy400|4 years ago|reply
How about affinities to possible health issues, which could be avoided if you started now and not in 20 years?
[+] Cyclical|4 years ago|reply
Nanopore sequencing is a really interesting technology. It utilizes fundamentally the same apparatus as a Coulter Counter [1], which is a general method of counting and sizing arbitrary particles that's frequently used in flow cytometry. Applying it to sequencing by drawing unwound DNA through the pore was a really excellent logical leap, and we're only now starting to see the benefits of it, even though it was first ideated over 30 years ago.

[1] https://en.wikipedia.org/wiki/Coulter_counter

[+] a-dub|4 years ago|reply
the nanopore units are awesome! although if i recall, most of the device is a replaceable one-time-use consumable, and that consumable is quite expensive (at least hundreds, if not thousands).

when i looked i was interested, but was turned off when i saw that the cost far outstripped commercial sequencing services.

[+] GekkePrutser|4 years ago|reply
I don't see any reference to the "USB dongle" mentioned in the title. I was thinking this would be some cool thing you could do at home.
[+] kingcharles|4 years ago|reply
So, how long before I can take my DNA "ROM" file and boot it in an emulator that would allow it to grow?
[+] twotwotwo|4 years ago|reply
A researcher mentions using a compact index based on the Burrows-Wheeler Transform to fit things in less memory compared to using a huge hashtable.

I see open-source implementations of BWT-based indexes (FM-Index/FMtree) out there. Out of curiosity, does anyone know of anything using BWTs for compact indexes in more everyday uses (like full-text search), or alternately reasons it doesn't really work outside the genome-alignment use case? Likely it only 'pays for itself' if you really need the space savings (like, it's what makes an index fit in RAM) or else we'd see it in use more places. It'd still be kinda neat to actually see those tradeoffs.
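For readers who have not seen the transform itself, here is a minimal Python sketch of the BWT and its inverse using the naive sorted-rotations method. This is only meant to show what the transform does; it is quadratic or worse in memory and time, and is not how the genome aligners build their indexes. The function names and the `$` sentinel are choices made for this example.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler Transform via sorted rotations (naive, for illustration)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Invert the BWT by repeatedly prepending the last column and re-sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

The output tends to group identical characters into runs, which is why it compresses well; the index structures add rank information on top of this string so it can be searched without inverting it.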

[+] jltsiren|4 years ago|reply
There was some interest in the information retrieval research community 10-15 years ago, but I don't think anyone ever found a good application for it. Some limitations of the BWT always got in the way.

The BWT sees strings as integer sequences. Either "ABC" and "abc" are two unrelated strings, or you normalize before building the index and lose the ability to distinguish between the two.

Search proceeds character-by-character backwards, jumping arbitrarily around the BWT using the same LF-mapping function as when inverting the BWT. You get cache misses for every character.

BWT construction is expensive, because you want a single BWT for the entire string collection. There is a ridiculous number of papers on BWT construction, as well as on updating and merging existing BWTs, but the problem has still not been solved adequately. If your data is measured in gigabytes, you can just pay the price and build the index, but a few terabytes seems to be the practical upper limit for the current approaches.

You can of course partition the data and build multiple indexes, but then you have to search for each pattern in each index. There is no way to partition the data in a way that different indexes would be responsible for different queries.
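A minimal Python sketch of the backward search described above, for anyone who wants to see where the cache misses come from. It is a toy: the index is built from sorted rotations and the occurrence table is stored uncompressed, whereas real implementations use suffix-array construction and sampled or compressed rank structures. `FMIndex`, `count`, and `count_partitioned` are names invented for this example, and patterns are assumed not to contain the sentinel.

```python
from typing import Dict, Iterable, List


class FMIndex:
    """Toy FM-index supporting only occurrence counting."""

    def __init__(self, text: str, sentinel: str = "$") -> None:
        s = text + sentinel
        # Sort rotation start positions; the BWT character precedes each start.
        order = sorted(range(len(s)), key=lambda i: s[i:] + s[:i])
        self.bwt = "".join(s[i - 1] for i in order)
        alphabet = sorted(set(s))
        # first[c]: number of characters in s strictly smaller than c.
        self.first: Dict[str, int] = {}
        total = 0
        for c in alphabet:
            self.first[c] = total
            total += s.count(c)
        # occ[c][k]: number of occurrences of c in bwt[:k].
        self.occ: Dict[str, List[int]] = {c: [0] * (len(s) + 1) for c in alphabet}
        for k, ch in enumerate(self.bwt):
            for c in alphabet:
                self.occ[c][k + 1] = self.occ[c][k] + (1 if c == ch else 0)

    def count(self, pattern: str) -> int:
        """Count occurrences of pattern, consuming it from its last character."""
        if not pattern:
            return 0
        lo, hi = 0, len(self.bwt)  # half-open range of matching sorted rows
        for c in reversed(pattern):
            if c not in self.first:
                return 0
            # LF-mapping step: each lookup may land anywhere in the occ table.
            lo = self.first[c] + self.occ[c][lo]
            hi = self.first[c] + self.occ[c][hi]
            if lo >= hi:
                return 0
        return hi - lo


def count_partitioned(indexes: Iterable[FMIndex], pattern: str) -> int:
    """With partitioned data, every index has to be queried for every pattern."""
    return sum(index.count(pattern) for index in indexes)


if __name__ == "__main__":
    print(FMIndex("banana").count("ana"))  # 2
    print(count_partitioned([FMIndex("banana"), FMIndex("bandana")], "ana"))  # 3
```

Each loop iteration reads `occ` at two positions that depend on the previous step, so the accesses are unpredictable, and `count_partitioned` shows why partitioning multiplies query cost by the number of partitions.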

[+] dekhn|4 years ago|reply
Folks are free to analyze my genome, https://my.pgp-hms.org/profile/hu80855C

Last time it was analyzed the conclusion was that there was nothing actionable.

[+] zmmmmm|4 years ago|reply
Have you ever encountered any insurance implications from it? eg: questioned whether you have ever had a genomic test etc. and had to answer yes and then them wanting to see results?

I guess in your case where nothing actionable is found it's benign. It will be the cases where there are risk factors for late onset things - cancer, diabetes, heart disease etc. where it would get sticky.

[+] lend000|4 years ago|reply
How does it get the DNA to go through the hole?
[+] Cyclical|4 years ago|reply
Initially, the DNA is brought near the pore through diffusive (Brownian) motion + any small attraction it'll have to the membrane. Close to the pore it uses a combination of the electrophoretic and electro-osmotic effects to draw the DNA molecules through. The application of an external electric field will cause the charged DNA molecules to migrate along the field (electrophoresis). This is independent of the fluid, and happens to any ions under voltage. The electro-osmotic flow, on the other hand, is a motion of the fluid itself, pulling the DNA molecules along with it. EOF is a really interesting phenomenon which is caused by the interaction between the surface chemistry (vis-a-vis charge distribution) and the concentration gradient of charge carriers in the fluid. I'd recommend Fundamentals and Applications of Microfluidics by Nguyen et al. if you're looking for a good primer on electrically induced flows in microfluidics.
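As a rough sketch of the two effects, the standard textbook expressions are given below. They assume a uniform applied field and a thin electric double layer, which is only an approximation inside a nanometre-scale pore; the symbols are notation chosen for this example, not taken from the comment above.

```latex
% Electrophoresis: drift of the charged molecule relative to the fluid
\mathbf{v}_{\mathrm{ep}} = \mu_{\mathrm{ep}}\,\mathbf{E}

% Electro-osmotic flow: bulk fluid velocity (Helmholtz-Smoluchowski)
\mathbf{u}_{\mathrm{eo}} = -\frac{\varepsilon\,\zeta}{\eta}\,\mathbf{E}

% Net velocity of the molecule in the lab frame
\mathbf{v} = \mathbf{v}_{\mathrm{ep}} + \mathbf{u}_{\mathrm{eo}}
           = \left(\mu_{\mathrm{ep}} - \frac{\varepsilon\,\zeta}{\eta}\right)\mathbf{E}
```

Here $\mathbf{E}$ is the applied electric field, $\mu_{\mathrm{ep}}$ the electrophoretic mobility of the DNA (negative, since DNA is negatively charged), $\varepsilon$ the permittivity of the electrolyte, $\zeta$ the zeta potential of the pore wall, and $\eta$ the dynamic viscosity. Depending on the sign of $\zeta$, the electro-osmotic term can either help or oppose the electrophoretic drift.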
[+] wombatmobile|4 years ago|reply
> Why not make the software into a proprietary product? ... There’s such a race there that it’s hard to commercialize the software for the long term.” Schatz continues, “Plus our work is largely funded through government sponsored grants, so this is one of the important ways for us to give back to society.”

In some people's thoughts, making a better society is the first and most obvious thing to do with technology like this, not an accidental consequence of inconvenience. Fortunately, enough of those people are active in the world to make Main Street different to Wall Street, at least sometimes.

[+] klmr|4 years ago|reply
It’s a weird quote anyway since there is commercial, proprietary software for DNA sequence analysis. Just a few examples of companies in this space are Sentieon, Edico (acquired by Illumina) and Parabricks (acquired by Nvidia). And Michael knows this (they’re sufficiently well known, and his own research laid some of the earliest foundations that Parabricks would ultimately build upon) so I’m assuming the quote was taken out of context or he was talking specifically about his own lab.
[+] thadk|4 years ago|reply
Maybe we should be able to check out these nanopore sequencers at our local library, or even other simple and robust medical devices, like handheld ultrasound units that plug into iPads?
[+] luxpir|4 years ago|reply
There is a 3+ year old London-based project, partnered with an established genome sequencing company, doing something highly interesting.

They sell swab kits directly, or via NFT purchase, for ~$500 for a 30x near complete sequencing (that's 30 passes for over 99.9% vs 0.2% for 23andme et al). The results are stored in an encrypted AMD SEV-E vault to be accessed by big pharma or individuals, only for specific markers, in exchange for the $GENE token paid directly to the genome owner. Figures touted are $50-80 per request. This token is burned as kits are sold, can be staked, offers rewards like DAO membership, can be gifted to charities researching specific diseases in various populations. It can act as a form of UBI in unbanked populations and puts your DNA back in your control.

To me it's the best use of web3 tech I've come across, so disclaimer, I am invested and a DAO member, but it's early in the project still. They are not quite ready for mass marketing. They are moving over to Polygon for very low transaction fees in January, will be launching the first joint NFT/kit sale (the next season might include personal genetically generated art) to fill the vaults with 10k sequenced genomes. They are over half way already through work with charities, but that is the magic number before big pharma can start making queries. Right now though they are quietly building and preparing before marketing plans kick in later in Q1.

Take a look at https://genomes.io where everything is explained in more detail, the team are presented and the tokenomics set out.

TL;DR - for $500 right now you can get your entire genome sequenced and stored in a vault to earn you passive income, if you agree to each query. But wait for the NFT rather than buying directly; it will have more perks.