The experimental design of genome-wide association studies (GWAS) is now 10 years old, and Peter Visscher and coworkers [attached] review the remarkable range of discoveries that it has facilitated in population and complex-trait genetics, the biology of diseases, and translation toward possible new therapeutics. The authors predict the likely discoveries of the next 10 years, when GWAS will be based on [a] millions of samples with array data imputed to a large, fully sequenced reference panel and [b] hundreds of thousands of samples with whole-genome sequencing (WGS) data. This elegant review/update is very optimistic, whereas another article (to be shared next with all of GEITP) tends to be more circumspect and NOT so optimistic.
As often mentioned in our GEITP series, many interindividual responses to a drug or an environmental toxicant represent multifactorial traits, each of which involves hundreds if not thousands of genes, plus epigenetic effects, plus environmental factors. The authors draw general conclusions from GWAS discoveries across a wide range of traits and review the latest progress in three prototypic diseases: type-2 diabetes (T2D), auto-immune diseases, and schizophrenia. They end with several sections on the limitations of current experimental designs and possible ways to overcome them, plus a prediction on the future of GWAS for human traits.
GWAS have led to a remarkable range of discoveries in human genetics over the last decade. GWAS have delivered on their original aim of detecting associations between common DNA variants and human diseases and disorders. They have led to a better understanding of the genetic architecture of complex traits and, therefore, of past natural selection on traits associated with fitness. GWAS have led to the discovery of variants, genes, and biological pathways that play a role in specific diseases, disorders, and other multifactorial traits such as height, IQ, and body mass index (BMI). GWAS have also led to new discoveries in disease epidemiology and to the discovery, or repurposing, of candidate therapeutics. As foreshadowed in 2007, GWAS have indeed been a case of trying to drink water from a fire hose.
The power calculations (see Appendix A, page 16) illustrate very clearly the trade-off between sample size, allele frequency, and effect size. In the future, when GWAS are likely to be performed by WGS rather than SNP-chips, the r² value and the R_imp² value in Equations A4 and A5 will be 1.0, provided the causal variant is sequenced without error. Hence, compared with array-based genotyped or imputed data, considerable power can be gained to detect association between a trait and a sequenced variant when the r² value or the R_imp² value is small to modest, but only if the experimental sample size is kept constant.
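For readers who like to see the arithmetic, here is a minimal sketch of this kind of power calculation. It is not a transcription of the authors' Equations A4 and A5. It uses a common textbook approximation and assumes a quantitative trait standardized to unit variance, an additive variant in Hardy-Weinberg equilibrium, and a 1-degree-of-freedom test. The function names (`variance_explained`, `gwas_power`) and the parameter `r2` (taken here as the squared correlation between the tested variant and the causal variant) are illustrative choices, not names from the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def variance_explained(maf, beta):
    """Trait variance explained by an additive variant: q2 = 2p(1-p)*beta^2.

    Assumes Hardy-Weinberg proportions and a trait with unit variance,
    so beta is in standard-deviation units per allele.
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must lie strictly between 0 and 1")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def gwas_power(n, maf, beta, r2=1.0, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    n     : sample size
    r2    : squared correlation between tested and causal variant
            (1.0 if the causal variant itself is typed without error)
    alpha : significance threshold (genome-wide default 5e-8)
    """
    q2 = r2 * variance_explained(maf, beta)  # variance tagged by tested variant
    if q2 >= 1.0:
        raise ValueError("variant cannot explain all of the trait variance")
    ncp = n * q2 / (1.0 - q2)                 # non-centrality parameter
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2)  # two-sided critical value
    root = sqrt(ncp)
    # The test statistic is approximately N(sqrt(ncp), 1); sum both tails.
    return _STD_NORMAL.cdf(root - z_crit) + _STD_NORMAL.cdf(-root - z_crit)


if __name__ == "__main__":
    n = 50_000
    scenarios = [
        ("common variant, sequenced (r2 = 1.0)", 0.30, 0.05, 1.0),
        ("common variant, tagged (r2 = 0.5)", 0.30, 0.05, 0.5),
        ("rare variant, larger effect (r2 = 1.0)", 0.001, 0.20, 1.0),
    ]
    for label, maf, beta, r2 in scenarios:
        print(f"{label}: power = {gwas_power(n, maf, beta, r2):.4f}")
```

Under these assumptions, and at a fixed sample size, halving `r2` visibly lowers the power for the common variant, and the rare variant has far less power than the common one despite its fourfold larger per-allele effect. The scenario values are made up for illustration and are not taken from the review.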
Equation A5 also demonstrates that the power to detect an association for a rare variant is limited because of its low allele frequency, even if its effect size is larger than that of a common variant. This means that, for rare-variant associations, the sample sizes will need to be very large, or at least comparable to those used for GWAS with common variants; this remains true even with WGS data. Please note especially that the authors' Ref. 145 is the statistical paper that started it all; for more than 10 years, everyone performing GWAS has appreciated the need to attain P-values of <5.0e-08 (also often expressed as <5.0 × 10⁻⁸).
Am J Hum Genet July 2017; 101: 5–22