Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes

As with the previous GEITP email, the topic of this study [see attached report; note there are a bazillion coauthors on this publication :-)] is again the polygenic risk score (PRS), the latest refinement of genome-wide association studies (GWAS). PRS is the “new kid on the block.”

As these GEITP pages have stated previously, breast cancer is a multifactorial trait [i.e. reflecting the contribution of hundreds if not thousands of genes, plus epigenetic factors, plus endogenous influences (including age of onset of breast development), perhaps environmental effects (diet, smoking, occupation), and perhaps even one’s microbiome]. Many GWAS by large consortia have been published, resulting in ~170 potential genes identified statistically, but the total heritability (variance explained) is only ~40%. Thus, each of the numerous common breast-cancer-susceptibility variants discovered via GWAS confers only a small risk individually; however, their combined effect, when summarized as a PRS, can be substantial.
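
To make “summarized as a PRS” concrete: a PRS is commonly computed as a weighted sum of risk-allele counts, with each weight being that variant's per-allele log odds ratio. The sketch below assumes that standard form; the function name, the four weights, and the two genotypes are invented for illustration and are not taken from the study.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages.

    dosages: number of risk alleles carried at each variant (0, 1, or 2)
    weights: per-allele log odds ratio for each variant (hypothetical here)
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-allele log odds ratios for four variants (not real estimates).
weights = [0.05, 0.10, -0.03, 0.08]

# Hypothetical genotypes for two women, coded as risk-allele counts.
woman_a = [0, 1, 2, 0]
woman_b = [2, 2, 0, 2]

score_a = polygenic_risk_score(woman_a, weights)
score_b = polygenic_risk_score(woman_b, weights)
print(f"woman A: {score_a:.2f}, woman B: {score_b:.2f}")
```

Each weight is small, yet the scores of the two women differ noticeably once the variants are added up, which is the point made above about combined effects.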

Such genomic profiles can be used to stratify women according to their risk of developing breast cancer. This, in turn, holds the promise of improved breast cancer prevention and survival, through targeted screening or other preventive strategies in those women most likely to benefit. A 2015 study derived a PRS based on 77 established breast-cancer-susceptibility single-nucleotide variants (SNVs) and reported the levels of risk stratification achieved by this PRS. Empirical validation and characterization of the PRS in large-scale epidemiological studies have, however, not been carried out previously. In addition, more informative PRSs would improve the clinical utility of risk prediction.

The aim of the present study [see attached] was to develop individual PRSs, optimized for prediction of estrogen receptor (ER)-specific disease, from the largest available GWAS dataset, and to empirically validate the PRSs in prospective studies. The development dataset comprised 94,075 cases and 75,017 controls of European ancestry from 69 studies. Samples were genotyped using genome-wide arrays, and SNVs were selected by stepwise regression and/or least absolute shrinkage and selection operator (LASSO) [this performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the statistical model it produces]. The best-performing PRSs were then validated in an independent test set comprising 11,428 cases and 18,323 controls from 10 prospective studies, as well as in 190,040 women from the UK Biobank (3,215 incident breast cancers).
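
How LASSO manages to both select and shrink can be seen in a toy example. The sketch below is a plain coordinate-descent LASSO for a linear outcome, written in pure Python; it is a simplification chosen for illustration (a case/control analysis would use a penalized logistic model), and it is not the authors' pipeline. The data, the penalty value, and all function names are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return exactly 0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/2n)*||y - X*beta||^2 + penalty*||beta||_1 one coefficient at a time."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of variant j with the partial residual
            z = 0.0    # mean squared value of variant j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, penalty) / z if z > 0 else 0.0
    return beta


# Toy standardized "genotype" matrix: 4 samples (rows) x 3 variants (columns).
X = [
    [1, 1, 1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]
# Outcome built as 2.0 * variant 0 + 0.05 * variant 2; variant 1 has no effect.
y = [2.05, 1.95, -2.05, -1.95]

beta = lasso_coordinate_descent(X, y, penalty=0.1)
print(beta)
```

In this toy setting the strong variant is kept with a slightly shrunken coefficient, while the null variant and the very weak variant are set to zero, which is the "variable selection plus regularization" behavior described in the bracketed note above.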

For the best PRSs (313 SNVs), the odds ratio for overall disease in ten prospective studies was 1.61 per standard deviation of the PRS (i.e. each one-standard-deviation increase in the score raises the odds of disease by 61%), and the lifetime risk of overall breast cancer in the top centile (highest 1%) of the PRSs was 32.6%. Compared with women in the middle quintile (‘quintile’ = each of five equal-sized segments of a population), those in the highest 1% of risk had 4.37-fold and 2.78-fold risks, and those in the lowest 1% of risk had 0.16-fold and 0.27-fold risks, of developing ER-positive and ER-negative disease, respectively. The authors conclude that the PRS is a powerful and reliable predictor of breast cancer risk that may improve breast cancer prevention programs.
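
For a feel of how a modest-looking odds ratio turns into several-fold differences at the extremes, here is a rough back-of-the-envelope sketch. It assumes the PRS is normally distributed in the population and that the log odds of disease rise linearly with the score; both are modeling assumptions made here, not statements from the report. The value 1.61 is the overall odds ratio quoted above, treated as a per-standard-deviation effect, so the output is not expected to reproduce the subtype-specific figures, which come from the study's own data. Function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()   # PRS assumed standardized: mean 0, SD 1
OR_PER_SD = 1.61            # overall odds ratio per SD, taken from the text
BETA = math.log(OR_PER_SD)  # log odds ratio per SD


def shifted_cdf(p, beta):
    """Phi(z_p - beta), where z_p is the standard-normal quantile of p."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(p) - beta)


def mean_relative_odds(p_lo, p_hi, beta):
    """Average of exp(beta * z) over women between percentiles p_lo and p_hi.

    Uses the closed form for a normal score:
    E[exp(beta*z); a < z < b] = exp(beta^2 / 2) * (Phi(b - beta) - Phi(a - beta)).
    """
    band_mass = p_hi - p_lo
    shifted_mass = shifted_cdf(p_hi, beta) - shifted_cdf(p_lo, beta)
    return math.exp(beta ** 2 / 2) * shifted_mass / band_mass


middle_quintile = mean_relative_odds(0.40, 0.60, BETA)
top_vs_middle = mean_relative_odds(0.99, 1.00, BETA) / middle_quintile
bottom_vs_middle = mean_relative_odds(0.00, 0.01, BETA) / middle_quintile

print(f"top 1% vs middle quintile:    {top_vs_middle:.2f}")
print(f"bottom 1% vs middle quintile: {bottom_vs_middle:.2f}")
```

Under these assumptions the top 1% comes out several-fold above the middle quintile and the bottom 1% well below it, in the same direction as the empirical estimates reported above.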


Am J Hum Genet 3 Jan 2019; 104: 21–34
