Whole-genome sequencing (WGS) is becoming increasingly common for investigating the full spectrum of genetic variation associated with relatively common complex diseases (e.g. obesity, cancer, autism spectrum disorder), but the challenges in interpreting the data are considerable. Remember that complex diseases, just like most instances of drug efficacy or toxicity, or genetic differences in response to an environmental toxicant, are multifactorial traits (representing contributions from hundreds, if not thousands, of genes, plus epigenetic factors, plus environmental effects).
The primary motivation for a WGS study is to understand whether structural, rare, and de novo DNA variants (also called nucleotide changes; single-nucleotide variants, SNVs; mutations) in the noncoding genome contribute to disease etiology, in addition to the better-understood contribution from mutations in the coding genome. Keep in mind that “the human (i.e. coding) exome” comprises ~180,000 exons, which together represent ~30 megabases (Mb, or million bases), or about 1% of the total genome. This means the noncoding genome represents the remaining ~99% of the human genome! Nevertheless, the human exome is believed to harbor ~85% of the mutations that exhibit large-effect contributions to disease.
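As a quick sanity check on the figures above, the arithmetic can be sketched in a few lines of Python. The exon count and exome size come from the text; the total genome size of ~3,100 Mb is an assumption added here for illustration and is not stated in the summary.

```python
# Figures quoted in the text above.
EXOME_MB = 30.0      # approximate size of the coding exome, in megabases
N_EXONS = 180_000    # approximate number of exons

# ASSUMPTION (not from the text): haploid human genome of roughly 3,100 Mb.
GENOME_MB = 3_100.0

# Share of the genome that is coding vs noncoding.
exome_fraction = EXOME_MB / GENOME_MB
noncoding_fraction = 1.0 - exome_fraction

# Average exon length implied by the two quoted figures, in base pairs.
mean_exon_bp = EXOME_MB * 1_000_000 / N_EXONS

print(f"exome share of genome:     {exome_fraction:.1%}")
print(f"noncoding share of genome: {noncoding_fraction:.1%}")
print(f"implied mean exon length:  {mean_exon_bp:.0f} bp")
```

Under that assumed genome size, the exome comes out at roughly 1% of the genome, consistent with the text, and the implied average exon length is about 167 bp.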
The authors [see attached study & editorial] provide the first serious attempt to establish a framework for enrichment analyses of rare noncoding variation in WGS studies of common complex diseases. Using WGS, they evaluated rare and de novo noncoding SNVs, insertions/deletions (indels), and all classes of structural variation (e.g. copy-number variations, CNVs; large rearrangements; repetitive types of DNA). By integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, the authors defined 51,801 annotation categories!
Analysis of 519 autism spectrum disorder (ASD) families (i.e. unaffected vs affected members) did NOT identify a significant association with any category after the authors corrected for 4,123 effective tests (this contradicts a number of previously published ASD studies). Without appropriate correction, biologically plausible associations are observed, but in both cases and controls. Despite the exclusion of previously identified gene-disrupting mutations, coding regions still DO exhibit the strongest associations with ASD. Thus, in autism, the contribution of de novo variants in the noncoding region is probably modest in comparison to that of de novo coding variants. Robust results from future WGS studies will require even larger cohorts and comprehensive analytical strategies that take this substantial multiple-testing burden into account. Prediction of ASD risk is extremely unlikely, whereas identification of future drug targets for treating ASD remains a possibility.
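To give a feel for how heavy this multiple-testing burden is, here is a minimal sketch of a Bonferroni-style threshold calculation. The counts (51,801 categories; 4,123 effective tests) come from the summary above; the nominal alpha of 0.05 and the use of a plain Bonferroni division are assumptions made here for illustration, and the function name is hypothetical. The study's actual correction procedure may differ.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


ALPHA = 0.05                 # ASSUMPTION: conventional nominal significance level
N_CATEGORIES = 51_801        # annotation categories defined by the authors
N_EFFECTIVE_TESTS = 4_123    # effective number of tests reported in the summary

# Threshold when correcting for the effective number of tests.
threshold_effective = bonferroni_threshold(ALPHA, N_EFFECTIVE_TESTS)

# Stricter threshold if every category were treated as a separate test.
threshold_all = bonferroni_threshold(ALPHA, N_CATEGORIES)

# With no correction at all, and if no category were truly associated,
# this many categories would be expected to pass p < ALPHA by chance
# (a rough figure: the categories overlap and are not independent).
expected_false_positives = ALPHA * N_CATEGORIES

print(f"threshold, effective tests: {threshold_effective:.2e}")
print(f"threshold, all categories:  {threshold_all:.2e}")
print(f"expected chance hits at p < {ALPHA}: {expected_false_positives:.0f}")
```

Under these assumptions a category would need a p-value on the order of 1e-5 to count as significant, while thousands of categories would be expected to clear an uncorrected p < 0.05 by chance alone. That is consistent with the observation that uncorrected, plausible-looking associations turn up in both cases and controls.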
Nature Genetics May 2018; 50: 727–736 & 635–637 [News’N’Views editorial]