The earliest genome-wide association studies (GWAS) immediately raised the question of which genes are the targets of the identified disease risk variants (single-nucleotide variants; SNVs), which sometimes lie inside a gene but more often lie some distance upstream or downstream of the transcribed gene. In other words, an SNV some distance from the nearest gene might be influencing expression of that gene, or it might be affecting expression of a gene much further away. GWAS have mapped thousands of variants associated with a range of phenotypes, from biometric traits to complex immune diseases. Despite these “successes”, it has been a major challenge to translate the associated SNVs into molecular mechanisms. Because the vast majority of disease-associated variants fall outside protein-coding sequence, even something conceptually as simple as assigning disease variants to their target genes has proved difficult for geneticists.
To overcome this problem, different approaches have been taken: tentatively identifying a candidate gene on the basis of functional relevance to disease biology, reporting the gene nearest to the variant, or claiming a gene whose expression is affected by the same variant [i.e. an expression quantitative trait locus (eQTL)]. All of these approaches, however, lack a direct link between the associated SNV and its target gene. The authors [see attached full-length paper] generated a high-resolution map of enhancer–promoter interactions in rare, disease-relevant cell types, thus mapping physical interactions between target genes and regulatory elements that contain variants associated with autoimmune and cardiovascular diseases.
Gene expression programs are intimately linked to the hierarchical organization of the genome. In mammalian cells, each chromosome is organized into hundreds of megabase-sized topologically associated domains (TADs), which are conserved from early stem cells to differentiated cell types. Within this invariant TAD scaffold, cell-type-specific enhancer–promoter interactions establish regulatory gene expression programs. Standard methods require tens of millions of cells to obtain high-resolution interaction maps and to assign enhancer–promoter contacts confidently. Hence, the principles that govern enhancer–promoter conformation in disease-relevant patient samples, where cells are scarce, are not well understood. This gap in understanding is particularly problematic for interpreting the molecular functions of inherited risk factors for common human diseases, which reside in intergenic enhancers or other noncoding DNA features in as many as 90% of cases.
Such disease-relevant enhancers may not influence expression of the nearest gene (often reported as the default target in the GWAS literature) and may instead act in a cell-type-specific manner on distant target genes residing up to hundreds of kilobases away. Recently, systematic perturbations of regulatory elements in select gene loci have shown that the effects of individual regulatory elements on gene activity can be predicted from the combination of [a] enhancer activity [marked by histone H3 lysine 27 acetylation (H3K27ac) levels] and [b] enhancer–target looping. In the attached report, the authors leverage this insight to capture both types of information across the genome in a single assay, mapping the enhancer connectome in disease-relevant primary human cells.
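As a rough illustration of how the two signals might be combined, the sketch below scores each candidate enhancer–gene pair as the product of enhancer activity and contact frequency, then expresses it as a share of the total for that gene. The product form, the function name `score_enhancer_gene_pairs`, and the toy numbers are all assumptions made for illustration; this is not the authors' analysis pipeline and the values are not real data.

```python
from collections import defaultdict


def score_enhancer_gene_pairs(pairs):
    """Score enhancer-gene pairs from activity and contact (illustrative only).

    pairs: iterable of (enhancer, gene, activity, contact), where
      activity ~ H3K27ac signal at the enhancer (arbitrary units, assumed)
      contact  ~ enhancer-promoter contact frequency (arbitrary units, assumed)
    Returns {(enhancer, gene): share}, where share is activity * contact
    divided by the sum of activity * contact over all enhancers of that gene.
    """
    raw = {}
    totals = defaultdict(float)
    for enhancer, gene, activity, contact in pairs:
        value = activity * contact
        raw[(enhancer, gene)] = value
        totals[gene] += value
    return {
        key: (value / totals[key[1]] if totals[key[1]] > 0 else 0.0)
        for key, value in raw.items()
    }


# Hypothetical toy input: two enhancers, two genes.
toy_pairs = [
    ("enh1", "GENE_A", 8.0, 0.5),  # strong enhancer, moderate contact
    ("enh2", "GENE_A", 2.0, 0.5),  # weak enhancer, same contact
    ("enh2", "GENE_B", 2.0, 1.5),  # weak enhancer, strong contact
]
scores = score_enhancer_gene_pairs(toy_pairs)
for (enhancer, gene), share in sorted(scores.items()):
    print(f"{enhancer} -> {gene}: {share:.2f}")
```

In this toy case the weakly active enhancer `enh2` contributes little to `GENE_A` but is the only scored input to `GENE_B`, which shows why neither activity nor looping alone is enough to rank targets.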
The authors [see attached] show that H3K27ac HiChIP [a protein-centric chromatin conformation method, which improves the yield of conformation-informative reads by more than 10-fold and lowers the input requirement more than 100-fold relative to other ChIP methods] generates high-resolution contact maps of active enhancers and target genes in rare primary human T-cell subtypes and coronary artery smooth muscle cells. Differentiation of naive T cells into T-helper-17 cells or regulatory T cells creates subtype-specific enhancer–promoter interactions, specifically at regions of shared DNA accessibility. These findings provide a principled means of assigning molecular functions to autoimmune and cardiovascular disease risk variants, linking hundreds of noncoding SNVs to putative gene targets. Target genes identified with HiChIP are further supported by CRISPR interference and activation at linked enhancers, by the presence of eQTLs, and by allele-specific enhancer loops in patient-derived primary cells. The majority of disease-associated enhancers contact genes beyond the nearest gene in the linear genome, leading to a 4-fold increase in the number of potential target genes for autoimmune and cardiovascular diseases.
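The contrast between nearest-gene assignment and contact-based assignment can be sketched in a few lines. The example assumes a single chromosome, loops given as pairs of fixed-size genomic bins, and genes represented only by their transcription start site (TSS); the function names, the 5 kb bin size, and all coordinates are hypothetical and chosen only to make the point visible.

```python
def nearest_gene(snv_pos, tss_by_gene):
    """Return the gene whose TSS is closest to the SNV in linear distance."""
    return min(tss_by_gene, key=lambda gene: abs(tss_by_gene[gene] - snv_pos))


def loop_linked_genes(snv_pos, tss_by_gene, loops, bin_size=5000):
    """Return genes whose TSS bin is looped to the bin containing the SNV.

    loops: iterable of (bin_a, bin_b) index pairs on one chromosome (assumed).
    """
    snv_bin = snv_pos // bin_size
    linked = set()
    for bin_a, bin_b in loops:
        if snv_bin == bin_a:
            other = bin_b
        elif snv_bin == bin_b:
            other = bin_a
        else:
            continue  # this loop does not touch the SNV's bin
        for gene, tss in tss_by_gene.items():
            if tss // bin_size == other:
                linked.add(gene)
    return sorted(linked)


# Hypothetical coordinates (bp) on one chromosome.
tss_by_gene = {"GENE_NEAR": 120_000, "GENE_FAR": 410_000, "GENE_OTHER": 655_000}
snv_pos = 102_500
loops = [(20, 82), (20, 131), (40, 60)]  # bin indices at 5 kb resolution

by_distance = nearest_gene(snv_pos, tss_by_gene)
by_contact = loop_linked_genes(snv_pos, tss_by_gene, loops)
print("nearest gene:", by_distance)
print("loop-linked genes:", by_contact)
```

Here the nearest gene is not among the contacted genes at all, and one variant is linked to two distal candidates, which is the kind of pattern that enlarges the set of potential target genes.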
Nat Genet Nov 2017; 49: 1602–1612 [full article] + pp. 1564-5 [News’N’Views editorial]