DNA damage during isolation and preparation can cause problems in the 1000 Genomes Project and other sequencing repositories

This study [attached] is perhaps not too surprising, and it describes something that everyone should constantly be on the lookout for. As in all scientific experiments, technical problems can lead to mistakes, some of which even make it into the final published paper or data set.

Mutations in somatic cells generate a heterogeneous genomic population, underscoring the genetic uniqueness of each individual. Of course, DNA mutations can be beneficial (leading to enhanced ability to find food, survival and fertility) but can also lead to serious medical disorders. Although cancer is the condition typically associated with somatic variation, advances in DNA sequencing indicate that cell-specific variants affect a number of phenotypes and pathologies. The authors [of attached report] show that mutagenic damage, incurred during DNA purification and isolation, accounts for the majority of erroneously identified variants with low to moderate (1 to 5%) frequency.

The authors found signatures of damage in most sequencing data sets in widely used resources, including the 1000 Genomes Project, The Cancer Genome Atlas, and other repositories. These data underscore the importance of DNA damage as a pervasive cause of sequencing errors. The extent of this damage directly confounds the determination of somatic variants in these data sets.

Science 17 Feb 2017; 355: 752-756

