top | item 45630810

(no title)

Ovah | 4 months ago

That's a prevalent misconception even in the scientific community. Sure, each read has 1% incorrect bases (0.01). But each segment of DNA is read many times over. More or less 0.01^(many times) ≈ 0 incorrect bases.

discuss

optionalsquid|4 months ago

The author got less than 1x coverage for their efforts. To get the kind of coverage required for reliable base-calls, you need significantly higher coverage, and therefore a significantly higher spend

bonsai_spool|4 months ago

> That's a prevalent misconception even in the scientific community. Sure, each read has 1% incorrect bases (0.01). But each segment of DNA is read many times over. More or less 0.01^(many times) ≈ 0 incorrect bases.

That's true in targeted sequencing, but when you try to sequence a whole genome, this is unlikely.

optionalsquid|4 months ago

> That's true in targeted sequencing, but when you try to sequence a whole genome, this is unlikely.

Whole-genome shotgun sequencing is pretty cheap these days.

The person you are replying to doesn't give any specific numbers, but in my experience, you aim for 5-20x average coverage for population level studies, depending on the number of samples and what you are looking for, and 30x or higher for studies where individuals are important.

For context, coverage refers to the (average) number of resulting DNA sequences that cover a given position in the target genome. Though there is of course variation in local coverage, regardless of your average coverage, and that can result in individual base-calls being being more or less reliable

jakobnissen|4 months ago

I worked with Nanopore data about four years ago, and I found that that's mostly true, but for some reason at some sites, there was systematic errors where more than half of reads were wrong.

I can't 100% prove it wasn't a legit mutation but our lab did several tests where we sequenced the same sample with both Illumina and Nanopore, and found Nanopore to be less than perfect even with exteme depth. Like, out depth was so high we routinely experienced overflow bugs in the assembly software because it stored the depth in a UInt16.