A Mammalian DNA Methylation Landscape

As this GEITP group has discussed numerous times, each person’s overall genetic architecture (landscape) represents the combination of differences in our: [a] genetics (e.g., DNA sequence changes); [b] epigenetics (chromosomal but not involving DNA sequence); [c] environmental effects (cigarette smoking, occupation); [d] endogenous influences (cardiovascular or renal disease); and [e] each individual’s microbiome. [Except for stem cell DNA sequence differences, the other four categories continue to be subject to change in all cell-types throughout one’s lifetime.]

Epigenetic processes include DNA-methylation, RNA-interference (RNAi), histone-modifications, and chromatin-remodeling.

Genome-wide association studies (GWASs) are used to determine associations of individual DNA nucleotide changes (single-nucleotide variants, SNVs) with phenotypes (traits such as height, longevity, schizophrenia, risk of autism spectrum disorder, risk of lung cancer from cigarette smoking, or risk of cancer from asbestos exposure). In contrast to looking at the genome (DNA nucleotide sequence) in a GWAS, the attached article and editorial examine “the methylome” (DNA sites that are methylated vs not methylated), or subsets of the entire methylome, comparing 348 mammalian species simultaneously. By the time you finish reading this, you probably will have learned more about DNA-methylation than you really cared to know. 😉

“Life span” is an example of a phenotype. Mammals vary greatly in life span (e.g., the bowhead whale can live up to 200 years, whereas the giant Sunda rat lives only about 6 months). This disparity is encoded in the genome of each species; however, which genes are linked to life span is still poorly understood. Because all mammals have (approximately) the same genes, variation in how these genes are regulated should be important in determining longevity.

Authors (see attached; I count more than 200 coauthors!!) present a large-scale study of DNA methylation (more methyl groups usually result in down-regulation of gene expression; fewer methyl groups generally mean up-regulation of gene expression) in a diverse range of mammalian species. Authors identified genomic regions that might control life-span variation among lineages, which could help uncover the molecular drivers of life span and other traits in mammals.

DNA methylation is a chemical modification (addition of a CH3- group) that, in mammalian genomes, almost always occurs at cytosines that are followed by a guanine (CpGs). DNA methylation information is inherited after mitosis; however, it keeps changing during development, among tissues, and over the lifetime of every organism. DNA methylation differences occur mostly at “enhancers” [i.e., stretches of DNA that dictate the expression of a nearby gene(s)]. Thus, each cell-type and tissue in the body has a precise DNA methylation signature (like a barcode).
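For readers who like to see the idea in code, here is a minimal Python sketch of what a “CpG” is and how the methylation level at one site is usually summarized as a fraction. The function names, the toy sequence, and the read counts are all made up for illustration; nothing here comes from the article.

```python
def cpg_positions(seq):
    """Return 0-based indices of every C that is immediately followed by a G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


def methylation_fraction(methylated_reads, unmethylated_reads):
    """Fraction of measurements showing a methyl group at one CpG.

    0.0 means the site is never methylated in the sample, 1.0 means always.
    """
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no measurements cover this CpG")
    return methylated_reads / total


# Hypothetical toy sequence (not from the paper).
toy_sequence = "ATCGGCGTACCGA"
sites = cpg_positions(toy_sequence)
print(sites)

# Hypothetical counts: 3 methylated and 1 unmethylated measurement at one site.
print(methylation_fraction(3, 1))
```

A “methylation signature” for a tissue is then simply the list of such fractions, one per CpG measured.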

Although DNA methylation is frequently not the main factor that dictates gene regulation, it is a robust biomarker for gene activity and cell identity. [To make things more mind-boggling, the human body has an estimated 208-212 cell-types; consequently, each of us would carry about 208-212 distinct “DNA methylomes” (epigenomes).]

DNA methylation is easier to measure than other classic (epigenetic) gene regulatory mechanisms, such as histone modifications or transcription factors. However, reliable quantification of DNA methylation across the genome is not trivial because current gold-standard methods (such as whole-genome bisulfite sequencing) require a reference genome and large amounts of data. This makes studying DNA methylation across a large number of samples difficult, and large sample sizes are required to find significant associations between DNA methylation and complex traits such as life span or body weight. This limitation can be overcome by using microarrays to probe for specific subsets of CpGs. Such microarrays have previously been used for studies in humans and mice.

Authors [see attached] used a recently designed pan-mammalian DNA methylation microarray that captures a subset of the CpGs that are conserved across all (available) mammals, including marsupials and egg-laying mammals (monotremes), at high confidence and for a fraction of the cost of other methods. Such a microarray does not need a reference genome, and the CpGs are directly comparable across samples and species. Authors profiled the DNA methylation of 15,456 samples from 348 species, including up to 70 tissues/cell-types per species. They used data from blood (a tissue comparable across all species) to obtain species relationships solely on the basis of DNA methylation. This clustering largely recapitulated the mammalian Tree of Life (which indicates that phylogeny, i.e., species relatedness, is a major factor that underlies variation in DNA methylation).
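To give a feel for how species relationships can fall out of methylation data alone, here is a small Python sketch. It assumes each species is represented by one blood methylation profile measured at the same conserved CpGs, and it uses a plain mean absolute difference as the distance. The species values are invented, and this simple nearest-neighbor comparison only stands in for the clustering the authors actually performed.

```python
def methylation_distance(profile_a, profile_b):
    """Mean absolute difference between two profiles measured at the same CpGs."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and cover the same CpGs")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b)) / len(profile_a)


def closest_species(profiles):
    """For each species, name the other species with the most similar profile."""
    result = {}
    for name, profile in profiles.items():
        candidates = [
            (methylation_distance(profile, other_profile), other_name)
            for other_name, other_profile in profiles.items()
            if other_name != name
        ]
        result[name] = min(candidates)[1]  # smallest distance wins
    return result


# Hypothetical blood methylation fractions at four shared CpGs (made-up numbers).
toy_blood = {
    "human": [0.80, 0.10, 0.50, 0.30],
    "chimpanzee": [0.78, 0.12, 0.52, 0.31],
    "mouse": [0.60, 0.30, 0.40, 0.55],
    "rat": [0.62, 0.28, 0.43, 0.52],
}
print(closest_species(toy_blood))
```

In this toy data the two primates pair up and the two rodents pair up, which is the small-scale version of a methylation-based tree resembling the known phylogeny.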

To disentangle the variation in DNA methylation explained by phylogeny from that explained by other traits such as age or tissue of origin, authors performed unsupervised clustering of all the CpGs (in all species and all tissues studied) according to their co-variation. CpGs that gained or lost methylation in a coordinated manner across many samples were grouped together into modules. Authors then looked for associations between these modules and a range of features (including species traits such as taxonomy or life span and individual traits such as age, sex, or body size). As expected, many CpG modules had methylation patterns that were specific to a taxonomic group. However, other modules included groups of CpGs whose methylation status was enough to discriminate the organ or sex of the sample, regardless of species.
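The idea of grouping co-varying CpGs into modules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CpGs are linked when the Pearson correlation of their methylation across samples reaches a threshold (0.9 here, chosen arbitrarily), and a module is any set of CpGs connected through such links. The CpG names and values are made up, and this simple thresholding is not claimed to be the method used in the study.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def co_methylation_modules(profiles, threshold=0.9):
    """Group CpGs whose methylation rises and falls together across samples.

    profiles maps a CpG name to its methylation fractions, one per sample.
    Modules are the connected components of the "correlation >= threshold" graph.
    """
    names = list(profiles)
    neighbors = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                neighbors[a].add(b)
                neighbors[b].add(a)

    modules, seen = [], set()
    for name in names:
        if name in seen:
            continue
        seen.add(name)
        stack, module = [name], []
        while stack:  # depth-first walk over linked CpGs
            current = stack.pop()
            module.append(current)
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        modules.append(sorted(module))
    return modules


# Hypothetical methylation fractions for four CpGs across four samples.
toy_profiles = {
    "cpg_a": [0.10, 0.20, 0.80, 0.90],
    "cpg_b": [0.15, 0.25, 0.85, 0.95],
    "cpg_c": [0.90, 0.80, 0.20, 0.10],
    "cpg_d": [0.88, 0.79, 0.25, 0.12],
}
modules = co_methylation_modules(toy_profiles)
print(modules)
```

Once modules exist, each one can be summarized (for example by its average methylation per sample) and tested for association with a trait such as life span, tissue, or sex.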

Several CpG modules were associated with life span. Variation in DNA methylation in these genomic regions explained, to some extent, differences in life span across species. This finding is linked to the discovery that, as humans and mice age, DNA methylation changes in many genomic regions. This has allowed the construction of so-called “epigenetic clocks,” which are mathematical models that enable the prediction of biological age on the basis of methylation status of specific CpGs. Because the relative onset of aging could be a major factor in determining species maximum life span, identifying CpG modules that are linked to cross-species variation in life span might identify gene-regulatory events that are responsible for differential aging processes in mammals.
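An “epigenetic clock” is, at heart, a regression model from methylation levels to age. The Python sketch below fits the simplest possible version, a straight line through a single CpG, by ordinary least squares. Real clocks combine many CpGs; the one-CpG model, the function names, and the toy numbers here are assumptions made purely for illustration.

```python
from statistics import mean


def fit_one_cpg_clock(methylation, ages):
    """Least-squares line: age = intercept + slope * methylation."""
    mx, my = mean(methylation), mean(ages)
    sxx = sum((x - mx) ** 2 for x in methylation)
    if sxx == 0:
        raise ValueError("methylation values must vary to fit a clock")
    slope = sum((x - mx) * (y - my) for x, y in zip(methylation, ages)) / sxx
    intercept = my - slope * mx
    return slope, intercept


def predict_age(methylation_value, slope, intercept):
    """Apply the fitted line to a new methylation measurement."""
    return intercept + slope * methylation_value


# Hypothetical training data: methylation at one CpG rises steadily with age.
toy_methylation = [0.20, 0.30, 0.40, 0.50]
toy_ages = [10.0, 30.0, 50.0, 70.0]

slope, intercept = fit_one_cpg_clock(toy_methylation, toy_ages)
print(round(predict_age(0.35, slope, intercept), 1))
```

The gap between the age such a model predicts and a sample's actual calendar age is what is often read as faster or slower biological aging.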

Among the genomic regions that were associated with life-span variation, some were predicted to be regulated by transcription factors (TFs) important for pluripotency (i.e., the ability to differentiate into any cell-type). These pluripotency factors include proteins, such as octamer-binding protein-4 (OCT4) and SRY-box transcription factor-2 (SOX2), whose expression can revert an adult differentiated cell to an embryonic-like cell. OCT4 and SOX2 belong to a group of transcription factors known as the Yamanaka factors, the experimental reactivation of which decreases markers of aging in mice. Authors found that experimental re-expression of the Yamanaka factors in adult mice affected the methylation status of some CpG modules associated with life-span variation. Therefore, regulation of these factors across the life of mammals might drive different life spans, with some species expressing them for longer.

In conclusion, this amazing study shows that DNA methylation can be a powerful biomarker across mammals. Furthermore, this study is an example of one direction in which the field of evolutionary genomics is headed…!! 😊😊

DwN

Science 11 Aug 2023; 381: eabq5693 (text 15 pages) + editorial, pp 602-603
