Leveraging Polygenic Functional Enrichment to Improve GWAS Power

Genome-wide association studies (GWAS) are the prevailing approach for identifying risk loci for common diseases and complex traits, such as schizophrenia, obesity, type-2 diabetes, drug efficacy, or response to environmental toxicants. In a GWAS, millions of single-nucleotide variants (SNVs) are assayed in a large cohort of individuals (e.g. N = 50,000 or 500,000), and each variant is marginally tested for association with the trait under investigation. To safeguard against false-positive (Type I error) associations, practitioners must impose stringent P-value thresholds, which can limit power. Consequently, the variants that are significant at genome-wide thresholds (e.g. P < 5.0 × 10⁻⁸; also written as 5.0e-08) explain only a small fraction of total SNV-associated heritability. For a fixed GWAS sample size, the statistical power to detect significant associations is determined by effect size, minor allele frequency (MAF), and levels of linkage disequilibrium [LD; the non-random association of alleles at different loci in a given population (each gene has two alleles, one on each chromosome of a homologous pair)] at causal and non-causal variants. These three parameters interact in non-trivial ways in the context of complex traits, as well as quantitative traits (e.g. height, body mass index, I.Q.). For example, it has been reported that, after adjusting for MAF, SNVs having lower levels of LD (i.e. decreased non-random association) have larger causal effects. These observations are motivating the development of new strategies that leverage polygenic (i.e. many genes) signals to improve GWAS power. Emerging functional genomics data have revealed that certain categories of variants are enriched for disease heritability. Thus, incorporating functional information into association analyses has the potential to increase GWAS power.
However, previous integrative methods for GWAS hypothesis-testing either assume sparse genetic architectures when estimating functional enrichment, or require knowledge or approximation of the true effect-size distribution, or do not produce P-values for each SNV as output. [Genetic architecture is the underlying genetic basis of a phenotypic trait and its variational properties; phenotypic variation for quantitative traits is the result of the segregation of alleles at quantitative trait loci (QTL).] In addition, general-purpose methodologies for association-testing that can integrate prior information have not yet been thoroughly evaluated in the context of GWAS that leverage functional genomics data. Authors [see attached article] propose an approach that uses polygenic modeling to weight SNVs according to how well they identify functional categories that are enriched for heritability. Their procedure takes as input summary association statistics, along with pre-specified functional annotations (which can be overlapping and/or continuously valued), and outputs well-calibrated P-values. Authors use a broad set of 75 coding, conserved, regulatory, and LD-related annotations that have previously been shown to be enriched for disease heritability. Authors then incorporate the computed weights by way of a weighted-Bonferroni procedure (which we won't go into). Through extensive simulations and analysis of UK Biobank phenotypes, authors demonstrate that their approach [called functionally informed novel discovery of risk loci (FINDOR)] reproducibly identified an additional 583 GWAS loci (a 13% increase in genome-wide significant loci detected, including a 20% increase for disease traits) while, at the same time, controlling for false positives. Authors conclude that "leveraging functional enrichment", using their FINDOR method, was able to robustly increase GWAS power.
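For readers curious about what a weighted-Bonferroni procedure looks like in general, here is a minimal Python sketch of the textbook version. It is not the authors' FINDOR implementation, and it says nothing about how FINDOR derives its weights from the 75 annotations. The function name `weighted_bonferroni`, the toy P-values, and the toy weights are all made up for illustration. The idea is that each SNV gets a non-negative weight, the weights are rescaled to average 1 (so the overall false-positive budget is unchanged), and each SNV is then compared against its own threshold, alpha × weight / number of tests.

```python
from typing import List, Sequence


def weighted_bonferroni(
    p_values: Sequence[float],
    weights: Sequence[float],
    alpha: float = 0.05,
) -> List[bool]:
    """Flag each test that passes its weighted-Bonferroni threshold.

    Hypothetical helper for illustration only; not FINDOR's code.
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    m = len(p_values)
    total = sum(weights)
    if m == 0 or total <= 0:
        raise ValueError("need at least one test and a positive total weight")

    # Rescale so the weights average to 1; this keeps the family-wise
    # error budget at alpha, as in the unweighted Bonferroni correction.
    scaled = [w * m / total for w in weights]

    # Unweighted Bonferroni threshold shared by all tests.
    base_threshold = alpha / m

    # Each test is compared against its own, weight-adjusted threshold.
    return [p <= base_threshold * w for p, w in zip(p_values, scaled)]


# Toy data (assumed values, not from the article): four SNVs.
toy_p = [1e-9, 0.02, 0.01, 0.5]
toy_w = [1.0, 3.0, 0.5, 0.5]   # up-weight SNV 2, down-weight SNVs 3 and 4

weighted = weighted_bonferroni(toy_p, toy_w)
unweighted = weighted_bonferroni(toy_p, [1.0] * len(toy_p))
print(weighted)    # [True, True, False, False]
print(unweighted)  # [True, False, True, False]
```

In this toy example, the up-weighted second SNV becomes significant, whereas the down-weighted third SNV no longer is. That is the trade-off behind any weighting scheme: power is gained where the weights are well chosen and lost where they are not. The toy run uses alpha = 0.05 over four tests; the usual genome-wide threshold of 5.0e-08 corresponds to the same alpha spread over roughly one million independent tests.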
These GEITP pages have previously discussed polygenic functional enrichment of GWAS data to enhance identification of relevant SNVs in multifactorial phenotypes, and we expect eventually to see this approach in studies of drug efficacy, risk of toxicity caused by environmental agents, etc.

DwN

Am J Hum Genet 3 Jan 2019; 104: 65–75
