Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk

These GEITP pages maintain the theme of gene-environment (GxE) interactions. Autism spectrum disorder (ASD) has often been included as one of our topics, because this fascinating complex disease is a multifactorial trait that might include not only GxE interactions, but also epigenetic effects, endogenous influences (e.g. heart or kidney disease), and perhaps even our microbiome. Over the past decade, work on the genetics/genomics of ASD has suggested that de novo mutations, including copy number variants (CNVs) and point mutations that likely disrupt protein-coding genes, are important contributory causes of ASD. However, when combined, all known ASD-associated genes explain only a small fraction of new cases. It is estimated that, overall, de novo mutations in protein-coding genes (including CNVs) contribute no more than 30% to the simplest, most-straightforward ASD cases (i.e. simplex ASD).

The vast majority of identified de novo mutations are located within intronic and intergenic regions (i.e. regions of the genome that do not code for protein). Yet little is known regarding their contribution to the genetic architecture (i.e. the underlying genetic basis of a phenotypic trait and its variational properties) of ASD or (for that matter) any other complex disease. ☹ Human regulatory regions show signs of negative selection, suggesting that mutations within these regions might lead to deleterious effects. Interestingly, studies of inherited common variants have also shown enriched disease association in noncoding regions; in fact, noncoding mutations that affect gene expression have been found to cause Mendelian diseases (traits caused by a single gene, or a very small number of genes), and such mutations are also enriched during the development of cancer. Expression-dosage effects have also been suggested to underlie the link between CNVs and ASD.

Recently, parentally inherited structural noncoding variants have been linked to ASD; in a small cohort of ASD families, some trends with limited sets of mutations have been reported. Likewise, despite the major role that RNA-binding proteins (RBPs) have in post-transcriptional (i.e. after DNA is transcribed into RNA) regulation, little is known of the pathogenic effect of noncoding mutations affecting RBPs (other than the effect of mutations in canonical splice sites). Noncoding mutations could therefore be one of the most important causes of ASD, but no conclusive connection between regulatory de novo noncoding mutations (either transcriptional or post-transcriptional) and the etiology of ASD has been established [recall that DNA transcription leads to RNA, and messenger RNA translation results in protein formation].

The authors [see attached article] address the challenge of detecting contributions of noncoding mutations to ASD with a deep-learning-based framework that can predict specific regulatory effects and the deleterious impact of genetic variants. Applying this framework to 1,790 ASD simplex families reveals a role for noncoding mutations: ASD probands (persons serving as the starting point for any genetic study of a family) harbor both transcriptional- and post-transcriptional-regulation-disrupting de novo mutations of significantly higher functional impact, compared with those in unaffected siblings. Further analysis suggests involvement of noncoding mutations in synaptic transmission and neuronal development and, taken together with previous studies, reveals a convergent genetic landscape of coding and noncoding mutations in ASD.
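For readers who want a feel for what a proband-versus-sibling comparison of this kind involves, here is a minimal sketch in Python. It is not the authors' method: the statistical test used in the article is not described here, so a simple one-sided permutation test is swapped in for illustration. The function name `permutation_test` and all of the scores are hypothetical, made-up values, not data from the study.

```python
import random
from statistics import mean


def permutation_test(probands, siblings, n_permutations=5000, seed=0):
    """One-sided permutation test: is the mean proband score higher
    than the mean sibling score by more than chance would explain?"""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = mean(probands) - mean(siblings)
    pooled = list(probands) + list(siblings)
    n = len(probands)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the "proband" and "sibling" labels
        rng.shuffle(pooled)
        diff = mean(pooled[:n]) - mean(pooled[n:])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical predicted-impact scores, invented purely for illustration
proband_scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]
sibling_scores = [0.41, 0.52, 0.38, 0.47, 0.50, 0.44, 0.36, 0.49]

observed, p_value = permutation_test(proband_scores, sibling_scores)
print(f"difference in mean score: {observed:.3f}")
print(f"one-sided permutation p-value: {p_value:.4f}")
```

A permutation test is used in this sketch because it makes no assumption about how the scores are distributed; a real analysis would also need to account for family structure and the number of mutations per individual.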

The authors demonstrate that sequences carrying prioritized mutations identified in probands possess allele-specific regulatory activity, and they highlight a link between noncoding mutations and heterogeneity in the IQ of ASD probands. Their predictive genomics framework is able to illuminate the role of noncoding mutations in ASD. Furthermore, this framework prioritizes mutations with high impact for further study, and should be broadly applicable to many other complex human diseases. 😊

DwN

Nat Genet June 2019; 51: 973-980
