Possible experimental and computational (PCR-amplified) artifacts when examining HPV in 135 cervical cancer samples

Earlier I shared a recent study by Hu et al. [Nat Genet 2015; 47: 158-163], reporting 3,667 nuclear-integration events of human papillomavirus (HPV) in 135 cervical cancer samples, identified on the basis of hybrid human-viral DNA fragments. A (very interesting) correspondence by Dyer et al. identified 87% of those reported integration sites as likely experimental and computational artifacts. Because most results were based on the identified insertions, this finding draws most of the conclusions of Hu et al. into question.

Artifact sites feature microhomology more often than genuine sites do, and the majority of genuine sites show no microhomology at all. Because DNA fragments are PCR-amplified before sequencing, one frequently obtains multiple reads corresponding to the same original fragment. Hu et al. tackled this problem by filtering out duplicate reads and applying a threshold based on read number. However, because this “filter” ignores sequencing and PCR errors, it is insufficient to completely exclude amplification bias. For example, filtering out identical reads among the reads shown in the figure [attached] still leaves multiple reads, because some contain mismatching bases.

Dyer et al. filtered the reads on the basis of their mapped genomic position and found that 72% of sites were supported by a single original fragment and were therefore indistinguishable from ligation of unrelated fragments during sample processing. As a computational control, they repeated the analysis, swapping the viral genome for the mitochondrial genome (mtDNA), which is physically separate from the nuclear genome (gDNA). Interestingly, they found that 0.024% (median, across samples) of mitochondrial reads corresponded to nuclear-mitochondrial hybrid fragments, closely matching the rate of nuclear-HPV hybrids (median of 0.017%) mapping to single-fragment loci.
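To make the difference between the two filters concrete, here is a minimal Python sketch. It is not the pipeline of either group: the toy reads, the function names, and the 100-bp locus window are all assumptions made up for illustration. It shows that an exact-sequence filter keeps a PCR duplicate that carries one sequencing error, whereas a position-based filter collapses it, and it then flags loci supported by only one original fragment.

```python
from collections import defaultdict

# Hypothetical toy reads: (chromosome, start, end, sequence). Not real data.
reads = [
    ("chr3", 1000, 1010, "ACGTACGTAC"),
    ("chr3", 1000, 1010, "ACGTACGTAC"),  # exact PCR duplicate
    ("chr3", 1000, 1010, "ACGTACGAAC"),  # same fragment, one sequencing error
    ("chr8", 5000, 5010, "TTGACCGTAA"),
    ("chr8", 5003, 5013, "ACCGTAATGC"),  # different fragment, same locus
]

def dedup_by_sequence(reads):
    """Keep one read per distinct sequence (exact-match filter)."""
    seen, kept = set(), []
    for read in reads:
        if read[3] not in seen:
            seen.add(read[3])
            kept.append(read)
    return kept

def dedup_by_position(reads):
    """Keep one read per distinct mapped position (chrom, start, end)."""
    seen, kept = set(), []
    for read in reads:
        key = read[:3]
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept

def fragments_per_locus(reads, window=100):
    """Count distinct original fragments per locus.

    A locus is approximated by binning start coordinates into
    fixed windows (an assumption of this sketch).
    """
    counts = defaultdict(int)
    for chrom, start, _end, _seq in dedup_by_position(reads):
        counts[(chrom, start // window)] += 1
    return dict(counts)

by_seq = dedup_by_sequence(reads)   # the erroneous duplicate survives
by_pos = dedup_by_position(reads)   # the erroneous duplicate is collapsed
loci = fragments_per_locus(reads)
single_fragment_loci = [locus for locus, n in loci.items() if n == 1]

print(len(by_seq), len(by_pos), single_fragment_loci)
```

In this toy set, the chr3 locus looks like it has two supporting reads after sequence-based filtering, but only one original fragment after position-based filtering, so a read-number threshold applied after the first filter would overstate its support.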

Therefore, if “single-fragment loci” were to be accepted as evidence, one would need to conclude that mitochondrial genomes integrate into the nuclear genome at a rate similar to that of the virus. Removing single-fragment loci left only 28% of the reported integration sites for further consideration. In summary, they found microhomology in 72% of viral ligation artifacts, 80% of mitochondrial ligation artifacts, and 43% of genuine insertion sites.

Hu et al. reply to this criticism. I’ll leave the judgment (of who sounds more like they know what they’re doing) … up to you …!!

Nat Genet Jan 2016; 48: 2–4
