Genome-wide association studies (GWAS) are being used to identify possible drug targets for treating a particular human complex disease; many labs are also attempting to use GWAS to “predict” one’s risk of a complex disease, or to predict the efficacy versus toxicity of a drug. The earliest GWAS, published in 2002, reported an association between variant sites (SNPs) in or near the LTA gene and myocardial infarction. The next GWAS, published in 2005, reported an association between the CFH gene and age-related macular degeneration. The GWAS field has grown explosively in recent years, with >24,000 SNP-trait associations reported in >2,500 studies [https://www.ebi.ac.uk/gwas/]. These robust GWAS, with P-values ranging from <10^-8 to <10^-400, underscore the value of using more stringent statistical significance levels when one is studying >1 million SNPs in large cohorts containing thousands, or even hundreds of thousands, of samples. For genotype-phenotype association tests, GWAS are far more reliable than studies involving one or several SNPs in small cohorts of several dozen, or even several hundred, individuals. Publications from such small studies, which are prone to type-I and type-II error artifacts, have variously been called “the incidentalome” and “the P <0.05 false-positive studies”.
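To make the point about stringent significance levels concrete, here is a minimal arithmetic sketch. It assumes a conventional family-wise error rate of 0.05 and a round figure of one million independent tests; both numbers, and the function names, are illustrative assumptions rather than values taken from any particular study.

```python
# Illustrative arithmetic only. The 0.05 family-wise error rate and the
# round figure of 1,000,000 independent tests are assumptions.
FAMILY_WISE_ALPHA = 0.05
N_TESTS = 1_000_000


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-SNP significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def expected_false_positives(per_test_alpha: float, n_tests: int) -> float:
    """Expected number of false hits if every tested SNP is truly null."""
    return per_test_alpha * n_tests


if __name__ == "__main__":
    threshold = bonferroni_threshold(FAMILY_WISE_ALPHA, N_TESTS)
    naive = expected_false_positives(0.05, N_TESTS)
    corrected = expected_false_positives(threshold, N_TESTS)
    print(f"Per-SNP threshold after correction: {threshold:.1e}")
    print(f"Expected false positives at P < 0.05: {naive:,.0f}")
    print(f"Expected false positives at the corrected threshold: {corrected:.2f}")
```

Under these assumptions, an uncorrected P < 0.05 cutoff would be expected to yield tens of thousands of spurious associations by chance alone, which is why a genome-wide study needs a threshold several orders of magnitude stricter.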
It is well known that several parameters (effect size, allele frequency, significance level, and sample size) all affect the statistical power of any GWAS. Statistical power improves with larger numbers of cases and controls. As the minor allele frequency (MAF) increases, fewer subjects are usually needed in the study group, and smaller contributions by a genetic variant to a phenotype become detectable. If the MAF is low, greater numbers per group are required, and only larger contributions by a variant to a phenotype can be detected.
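The interplay among these parameters can be sketched numerically. The example below is a simplified illustration, not a substitute for a dedicated power calculator. It assumes a quantitative trait standardized to unit variance, an additive per-allele effect `beta` (in standard-deviation units), Hardy-Weinberg proportions, and a two-sided 1-degree-of-freedom test; a case-control design would need a different model. The function names and the example values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def variance_explained(maf: float, beta: float) -> float:
    """Fraction of trait variance explained by one additive SNP (assumed model)."""
    if not 0.0 < maf <= 0.5:
        raise ValueError("maf must be in (0, 0.5]")
    q2 = 2.0 * maf * (1.0 - maf) * beta ** 2
    if q2 >= 1.0:
        raise ValueError("effect too large for a unit-variance trait")
    return q2


def gwas_power(n: int, maf: float, beta: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided 1-df association test at level alpha."""
    q2 = variance_explained(maf, beta)
    shift = sqrt(n * q2 / (1.0 - q2))  # square root of the non-centrality
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability that |Z| exceeds z_crit when Z ~ Normal(shift, 1).
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def required_sample_size(maf: float, beta: float, alpha: float = 5e-8,
                         target_power: float = 0.8) -> int:
    """Smallest n reaching the target power (ignores the negligible far tail)."""
    q2 = variance_explained(maf, beta)
    if q2 == 0.0:
        raise ValueError("a zero effect can never reach the target power")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_power = _STD_NORMAL.inv_cdf(target_power)
    return ceil((z_crit + z_power) ** 2 * (1.0 - q2) / q2)


if __name__ == "__main__":
    # Hypothetical per-allele effect of 0.05 SD, at several allele frequencies.
    for maf in (0.01, 0.05, 0.20, 0.50):
        n_needed = required_sample_size(maf, beta=0.05)
        print(f"MAF {maf:.2f}: about {n_needed:,} subjects for 80% power")
```

Under these assumptions, the required sample size falls as the MAF rises for a fixed effect size, and for a fixed sample size a rarer variant needs a larger effect to reach the same power, which mirrors the relationships described above.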
However, a provocative analysis [see attached article and editorial] now calls the future of the large-GWAS strategy into question, and raises doubts about whether funders should pour more money into large GWAS experiments. GWAS are fast expanding to encompass hundreds of thousands, even millions, of patients. Biologists are finding that the larger the cohort, the more genetic variants (or ‘hits’) are found, each having a minuscule influence on the trait being studied. It appears increasingly likely that common illnesses will be linked by GWAS to hundreds of thousands of DNA variants: potentially, to every single DNA region that happens to be active in a tissue involved in the disorder or phenotype being studied. The same will be true for studies of drug response (efficacy or toxicity).
The authors [see attached articles] suggest that many GWAS hits might not have any specific biological relevance to the trait (disease, drug response) and would not serve as good drug targets. Rather, these ‘peripheral’ variants probably act through complex biochemical regulatory networks to influence the activity of a few ‘core’ genes that are more directly connected to an illness. Many of us geneticists have considered that this view could be correct, because scientists simply do not yet understand the importance of biochemical networks. Rather than pursuing more and bigger GWAS, researchers and funders should devote their efforts to mapping regulatory networks in cells. Biologists who aim to link genes with diseases should focus on identifying the mutation(s) that directly cause(s) the specific disorder.
Cell 2017; 169: 1177–1186 plus Nature 22 Jun 2017; 546: 463 [editorial]