Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations

Genome-wide association studies (GWAS) began more than 12 years ago, with the hope that clinical geneticists would soon be able to [a] predict genetic risk of complex diseases, and [b] find new pathways for which new drugs might be developed (to prevent or treat that complex disease). In these past 12 years, GWAS findings have certainly identified new pathways for drug development. However, unexpectedly (for many investigators), prediction of risk has been found to be virtually impossible.

For a given complex disease, the hundreds of single-nucleotide variant (SNV) associations discovered so far have been found to account for only 10% to 25% of heritability. Looking at the “simple” quantitative trait of height, for example, a 2014 study used genome-wide data from N = 253,000 individuals and identified 697 variants having genome-wide statistical significance –– and all SNVs together could explain only ~20% of the heritability for adult height. It has been suggested that we could study the entire world population of 7.7 billion, as a cohort for height, and all significant SNVs, combined, would still not total 100% of heritability for that trait.

These GEITP pages have recently discussed (once on Sept 30th, another on Nov 5th) a new approach to genetic risk of a complex phenotype, which is called “genome-wide polygenic score” (GPS). Authors [see attached article & editorial] demonstrate further that GWAS-informed whole-genome profiling can quantify individual disease risks in clinically significant ways –– potentially leading to “the long-awaited use of genetic profiling” in routine healthcare practices.

The [attached] study calculated GPSs for five complex diseases [coronary artery disease (CAD); atrial fibrillation (AFIB); type-2 diabetes (T2D); inflammatory bowel disease (IBD); and breast cancer (BrCA)] in >300,000 individuals. The GPSs were created, using sophisticated algorithms to integrate per-variant estimates of disease risk, gleaned from GWAS. Disease classification accuracy was carefully validated across two cohorts. Surprisingly, authors found that, for many individuals, GPS-associated risks were as high as those correlated with SNVs influencing rare monogenic forms of disease already routinely considered in clinical settings.
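To make "integrating per-variant estimates of disease risk" concrete, here is a minimal Python sketch of the simplest form a polygenic score can take: a weighted sum of risk-allele counts. This is an illustration only, not the authors' method. The study describes "sophisticated algorithms" for deriving the weights, which this sketch does not attempt. The function name, the three variants, and their weights are all hypothetical.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages for one person.

    dosages: copies of the risk allele carried at each variant (0, 1, or 2).
    weights: per-variant effect estimates (hypothetical values here; in
             practice these would be derived from GWAS summary statistics).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical effect estimates for three variants (not from the study).
weights = [0.10, -0.05, 0.20]

# Two hypothetical individuals: risk-allele copies at each variant.
person_a = [2, 0, 1]
person_b = [0, 2, 0]

score_a = polygenic_score(person_a, weights)
score_b = polygenic_score(person_b, weights)
print(score_a, score_b)
```

A real genome-wide score sums over a very large number of variants rather than three, and a person's score only becomes interpretable when compared against the distribution of scores in a reference population.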

Authors further suggest that preventive interventions (e.g. lifestyle changes, and statin use) could be deployed, or suggested, to high-risk individuals, right now, on the basis of their GPSs. In addition, they found that ~20% of all individuals studied had >3-fold risk for at least one of the five diseases, and that the absolute number of individuals deemed at high risk from the GPS for CAD was 20-fold greater than the number expected to be identified via established monogenic variant screens. Authors proposed that it is time to contemplate the inclusion of polygenic risk prediction in clinical care, and to talk about relevant issues pertaining to GPS for many complex genetic diseases.

Nat Genet Oct 2018; 50: 1219–1224 [article] & pp 1210–1211 [News’N’Views]

COMMENT: Dan, Just as a comment on this GEITP email (which emails I continue to enjoy receiving): long before GWAS, we knew we could not predict complex trait risks explicitly from genetics, because monozygotic (identical) twin pairs typically are concordant (presence of the same trait in both members of a pair of twins) only 50% or less of the time. In our own area (cleft lip and palate), concordance is about 50%, compared with ~5% for dizygotic (fraternal) twins. So, we have always known that it is “not all genetics.” Nor, indeed, is it “all environmental effects” –– unless the environment in the uterus is highly variable –– which implies that stochastic (randomly determined) effects are important.

The genome-wide polygenic score (GPS) method is nice, because it provides some degree of “risk assessment,” and we need to provide our patients with enough of a “mathematical appreciation,” so that they might understand what their GPS means for any particular disease. However, exact predictions are fictional –– which still does not diminish the use of GWAS, etc. for druggable target discovery, new biology, and even some degree of risk determination.

COMMENT: Dear Dr. Nebert, I would say that this publication is an “important article.” It has been highlighted in many news reports and scientific conferences –– including last month’s American Society of Human Genetics meeting in San Diego. I think one of the major reasons is that this study showed (some) translational value (‘translational’ = multi-disciplinary, highly collaborative, ‘bench-to-bedside’ approach) and clinical usefulness of large-scale genomic studies. The research community and the general public “need” this type of “good news,” to fuel their hopes of “personalized medicine.”

I agree that the paper showed some value of genomic prediction using polygenic scores; however, it is mainly a “numbers game.” If you look at the overall performance –– evaluated by the area-under-the-curve (AUC) method –– the best score was ~80%, which barely reaches the threshold for the test to be (statistically) useful in clinical predictive assessment. This is why the authors emphasized only the extreme cases (i.e. only those with very high genetic risk), when they compared those predictive values to those having “monogenic risk.”
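For readers unfamiliar with AUC: it can be read as the probability that a randomly chosen affected individual has a higher score than a randomly chosen unaffected individual, so 0.5 is chance and 1.0 is perfect separation. The Python sketch below computes it directly from that definition. The function name and the six scores are made up for illustration and are not data from the study.

```python
def auc(case_scores, control_scores):
    """Fraction of (case, control) pairs in which the case scores higher.

    Ties count as half a win. This pairwise definition is equivalent to
    the area under the ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores (not from the study): cases tend to score higher,
# but one case (0.4) falls below one control (0.5).
cases = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]

print(auc(cases, controls))  # 8 of the 9 pairs are ordered correctly
```

On this reading, an AUC of ~80% means that roughly one in five case-control pairs is ranked the wrong way round, which is one way to see why overall discrimination is modest even though individuals in the extreme tail of the score distribution can carry substantially elevated risk.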

In my humble opinion, this is a study that has had a large impact (on the field of clinical genomics) –– much more so than any “real value.” For example, if you take a close look at Supplementary Table 9 [pasted below], their data on five complex diseases suggest that this GPS Model can explain only a very small fraction of the total variance.
