This topic involves heavy population genetics and genomics, so bear with me. Genome-wide association studies (GWAS) began in ~2005, and the number of GWAS has exploded ever since. Fundamentally, researchers choose a phenotype (trait), genotype or sequence the genomes of thousands of subjects having the trait, and compare those data with genomes of thousands of controls without the trait. Usually, the phenotype is a gradient (e.g. height, body mass index, IQ) in which a quantitative trait exhibits a distribution (often a bell-shaped curve) between two extremes.
GWAS have been performed to study genetic differences in drug response and in response to environmental toxicants, as well as single-nucleotide variants (SNVs) associated with complex diseases (multifactorial traits such as obesity, heart disease, cancer, schizophrenia). Increasingly large sample sizes have led to discovery of thousands of SNVs associated with individual traits, including complex diseases and risk factors for disease. Analyses of polygenicity of a variety of traits have further indicated that many individual traits are likely to be associated with thousands, to tens of thousands, of SNVs (located in and near genes), each making a very small contribution to the phenotype.
Thus, in the latest advancement in this field of population genomics, much attention has been paid to the usefulness of polygenic risk scores (PRS), which represent the genetic burden of a given trait. The long-term (highly optimistic) plan is to develop strategies for risk-based intervention through lifestyle modification, screening, and drug therapy. A PRS for a given trait is typically defined as “a weighted sum of a set of germline SNVs, in which the weight for each SNV corresponds to an estimate of the strength of association between the SNV and the trait.”
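To make the "weighted sum" definition concrete, here is a minimal Python sketch. The SNV names and weights are made up for illustration (they are not taken from either paper), and treating a missing genotype as zero risk alleles is a simplifying assumption, not how real pipelines handle missing data.

```python
# Hypothetical per-SNV weights (e.g. log odds ratios from a GWAS).
# These identifiers and values are invented for illustration only.
weights = {"snv_a": 0.12, "snv_b": -0.05, "snv_c": 0.30}


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2) over the SNVs in `weights`.

    Assumption: an SNV missing from `dosages` is counted as 0 risk alleles.
    """
    return sum(weights[snv] * dosages.get(snv, 0) for snv in weights)


# One hypothetical individual: two risk alleles at snv_a, one at snv_b, none at snv_c.
person = {"snv_a": 2, "snv_b": 1, "snv_c": 0}
score = polygenic_risk_score(person, weights)
print(f"PRS = {score:.2f}")
```

A real PRS uses the same arithmetic, just summed over hundreds to millions of SNVs instead of three.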
Recent studies indicate that, whereas PRS tend to have “modest predictive capacity overall”, they have the potential to offer “substantial stratification of a population” into distinct levels of risk for some common diseases (e.g. coronary artery disease, autism spectrum disorder, breast cancer). Clearly there is ongoing debate regarding the utility of PRS in clinical practice. However, PRS can be more robust and cost-efficient tools for risk stratification than other biomarkers and risk factors; in particular, PRS do not change over time (they measure germline SNVs and thus need to be measured only once). In addition, the risk associated with PRS for different traits appears in many cases to be fairly consistent over an individual’s life course, and time-varying lifestyle and clinical factors tend to act in a multiplicative way on baseline genetic risk. Furthermore, if genome-wide genotype and/or sequencing data are available on an individual, those same data can be used to evaluate the PRS for a large number of traits simultaneously. Thus, beyond the use of PRS for prevention of specific diseases, it is important to evaluate their utility for broad health outcomes, particularly if PRS are to be used in routine health care.
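What "multiplicative" means in practice can be shown with a toy calculation. This is only a sketch, assuming each factor is summarized as a relative risk and that the factors act independently; the function name and all numbers are hypothetical and come from neither paper.

```python
def combined_relative_risk(genetic_rr, lifestyle_rrs):
    """Multiply a baseline genetic relative risk by each lifestyle/clinical relative risk.

    Assumption: the factors act independently on a multiplicative scale.
    """
    rr = genetic_rr
    for factor in lifestyle_rrs:
        rr *= factor
    return rr


# Hypothetical person: twofold genetic risk, a harmful exposure (x1.5),
# and a protective habit (x0.8).
total = combined_relative_risk(2.0, [1.5, 0.8])
print(f"combined relative risk ~ {total:.1f}")
```

On this scale, the same lifestyle change removes more absolute risk in someone whose genetic baseline is high, which is the rationale for targeting interventions by PRS.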
The two studies [see attached] both deal with PRS; the first one describes SNVs that contribute to “all-cause mortality” and the second one concerns “colorectal cancer.”
While a small number of SNVs associated with lifespan have been identified, no study to date has systematically evaluated the ability of emerging PRS for life-threatening diseases and mortality risk factors to predict mortality. Using data from the UK Biobank to combine PRS for 13 diseases and 12 mortality risk factors into sex-specific composite PRS (cPRS), authors [see first paper] estimated differences in life expectancy between the top and bottom 5% of the cPRS to be ~4.79 years and ~6.75 years for women and men, respectively. These associations were substantially attenuated after adjusting for non-genetic mortality risk factors measured at study entry (i.e. middle-age for most participants). Authors naïvely suggest that “the cPRS may be useful in counseling younger individuals at higher genetic risk of mortality on modification of non-genetic factors.” [Hah. ☹]
Accurate colorectal cancer (CRC) risk prediction models are critical for identifying individuals at low versus high risk of developing CRC: those in a high-risk group can be offered targeted screening and interventions, whereas those in a low-risk group can avoid unnecessary screening and interventions. Authors [see second paper] compared 55,105 CRC-affected patients with 65,079 control subjects of European ancestry. Their PRS was built in three ways, using [a] 140 previously identified and validated CRC loci; [b] SNV selection based on linkage disequilibrium (LD = the non-random association of alleles at different loci in a given population) clumping, followed by machine-learning approaches; and [c] LDpred (a Bayesian approach for genome-wide risk prediction). Authors also tested the PRS in an independent cohort of 101,987 individuals, including 1,699 CRC-affected patients. The discriminatory accuracy, calculated as the age- and sex-adjusted area under the receiver operating characteristic curve (AUC), was highest for the LDpred-derived PRS (AUC ~0.654), which included nearly 1.2 million SNVs (with the proportion of causal SNVs for CRC assumed to be 0.003), whereas the PRS of the 140 known SNVs identified from GWAS had the lowest AUC (~0.629).
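For readers unfamiliar with the AUC: it is the probability that a randomly chosen case has a higher score than a randomly chosen control, so 0.5 is a coin flip and 1.0 is perfect separation. The sketch below computes a plain, unadjusted AUC by brute-force pairwise comparison on made-up scores; it does not reproduce the age- and sex-adjustment used in the paper.

```python
def auc(case_scores, control_scores):
    """Fraction of (case, control) pairs in which the case has the higher score.

    Ties count as half a win. This is the unadjusted AUC.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical PRS values, invented for illustration only.
cases = [0.9, 0.4, 0.7]
controls = [0.3, 0.5, 0.1, 0.4]
print(f"AUC = {auc(cases, controls):.3f}")
```

Seen this way, an AUC of ~0.65 means the PRS ranks a CRC-affected person above an unaffected one roughly two times out of three.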
Based on the LDpred-derived PRS, authors were able to identify 30% of individuals without a family history as having risk for CRC similar to that of individuals with a family history of CRC, whereas the PRS based on known GWAS SNVs identified only the top 10% as having a similar relative risk. About 90% of these individuals have no family history and would have been considered “average risk” under current screening guidelines, but might benefit from earlier screening…!! This breakthrough study demonstrates to these GEITP pages that PRS might actually offer a valuable method for risk-stratified CRC screening. Might other targeted interventions be demonstrated soon…?? 😊
Am J Hum Genet Sept 2020; 107: 418-431 & 432-444