One of the Most Widely Used Methods in Epigenetics Can Cause Misleading Results

This BREAKTHROUGH paper (reported today in FRONTLINE GENOMICS) is very important to those in the field (e.g. Sean Zhang), so I have moved it “to the front of the QUEUE.” There is nothing worse than spending a lot of time, and publishing data, when it turns out (in the end) to be an artifact.

DwN

One of the Most Widely Used Methods in Epigenetics Can Cause Misleading Results

An error in one of the most widely used methods in epigenetics, DIP-seq, can cause misleading results, researchers have shown. This may have major significance in the research field, where big data and advanced methods of DNA analysis are used to study vast amounts of epigenetic data. The error can be corrected in previously collected DIP-seq data, which may lead to new discoveries from previous studies of human epigenetics. The results are described in Nature Methods.

In principle, every cell in our body has the same DNA sequence. However, different cell types use very different groups of genes. This means that additional signals are required to control which genes are used in each individual cell type. Thus, with about 210 cell types in the human body, each of us has ~210 epigenomes.

One type of such signal consists of chemical groups directly attached to the DNA sequence. These chemical modifications of the DNA sequence form part of what is commonly called the epigenetic code. Epigenetic regulation of genes plays an important role in normal human development but is also associated with many diseases, such as cancer.

Researchers at Linköping University have now discovered a weakness in one of the most frequently used methods in epigenetic research, DNA immunoprecipitation sequencing (DIP-seq). Put simply, this method is based on picking out the parts of the DNA that carry a particular epigenetic signal. For this, the researchers use various antibodies that recognise a specific chemical structure and bind to it. The antibodies are subsequently sorted, and the sequences of the DNA that they have bound are determined. Nestor’s group noticed that certain epigenetic marks always occurred in the same place, even in DNA that shouldn’t contain those epigenetic marks at all.

“Our discovery highlights the importance of experimental validation when using high-throughput technologies in research,” says Colm Nestor, Assistant Professor at the Department of Clinical and Experimental Medicine and lead investigator of the study. “Without such experimental rigour, pervasive errors can hide in plain sight, concealed by their ‘consistency’ across studies”. By analysing more than 125 existing datasets the researchers revealed that DIP-seq commonly detected DNA sequences that did not have any epigenetic marks.

These false positives constitute 50-90% of the detected DNA regions, and the magnitude of the effect differs between different datasets. “Now that we know about this error, it’s extremely simple to subtract it away. Correcting for these errors will allow novel discoveries to be made from the wealth of epigenetics data already in the public domain,” Nestor continued.

The researchers point out that the vast majority of results from previous studies are correct. “We should continue to use these methods –– but correct for these errors by using appropriate experimental design,” Nestor concluded.
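The “subtract it away” correction mentioned above can be pictured with a small sketch. It assumes the recurring false-positive regions are already available as a list of genomic intervals; the article does not say how that list is obtained, and the region coordinates and function names here are purely hypothetical.

```python
from typing import List, Tuple

# A genomic region as (chromosome, start, end), with a half-open interval [start, end).
Region = Tuple[str, int, int]


def overlaps(a: Region, b: Region) -> bool:
    """Return True if two regions share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def subtract_artifacts(peaks: List[Region], artifacts: List[Region]) -> List[Region]:
    """Keep only the peaks that do not overlap any known artifact region."""
    return [p for p in peaks if not any(overlaps(p, a) for a in artifacts)]


# Hypothetical DIP-seq peaks and hypothetical artifact regions (illustration only).
peaks = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 100, 200)]
artifacts = [("chr1", 150, 180), ("chr2", 300, 400)]

kept = subtract_artifacts(peaks, artifacts)
removed_fraction = 1 - len(kept) / len(peaks)

print(kept)
print(f"fraction of peaks removed: {removed_fraction:.2f}")
```

The pairwise comparison is fine for a toy list; real datasets with many thousands of regions would normally use a dedicated interval tool instead.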

Posted in Center for Environmental Genetics | Comments Off on One of the Most Widely Used Methods in Epigenetics Can Cause Misleading Results

Moving on into a New Frontier: “3-dimensional genomics” (the way the chromosomes are situated in the nucleus)

As these GEITP pages keep harping on –– the factors that contribute to any trait (response to a drug or environmental toxicant, hair color, height, blood pressure, risk of type-2 diabetes, risk of autism spectrum disorder, etc.) include: genotype (DNA sequence); epigenetic effects (chromosomal but not DNA sequence changes); endogenous influences (renal blood flow, age, cardiovascular status, etc.); environmental factors (cigarette smoking, drug exposures, occupation, etc.); and even one’s individual microbiome differences (gut bacteria contributing metabolites). Now there appears to be a new kid on the block: “3D-genomics.”

It is becoming possible (technically) to evaluate how chromatin is organized in the 3-dimensional (3D) nuclear space (i.e. how the chromosomes are configured/oriented, relative to one another) –– which will help us understand various DNA-templated processes such as transcription, replication, and DNA recombination. This field has recently transitioned from microscopy-based approaches limited in resolution and throughput to a powerful combination of genomics, microscopy and computational technologies. Thus, the genomicist [see attached editorial] is now able to paint a high-resolution multi-scale picture of chromatin architecture. The breakthrough was spearheaded 16 years ago by development of the chromosome conformation capture (3C) method [Science 2002; 295, 1306]. Scientists then began to realize that chromosomes normally fold into active and inactive compartments, which are partitioned into smaller topologically-associating domains (TADs) and chromatin loops.

Intriguingly, genome topology is considered to have important roles in the regulation of gene expression, DNA replication, X-chromosome inactivation, adaptive immunity, and cell-fate decisions. Recent studies have shown that abnormal chromatin folding can lead to disease, including developmental disorders and cancer. Data generation, however, is still challenging and costly. Furthermore, it remains difficult to predict how complex changes to mammalian genomes (e.g. the large structural alterations often present in human disorders) impact chromatin architecture and subsequent gene regulatory processes. For those interested in reading further, please check out Nat Genet (May 2018) 50: page 631 & pp 662–667. These articles introduce new methodologies that promise to further facilitate the generation, and functional interpretation, of genome-wide chromatin interaction maps.

DwN

Nature Genetics May 2018; 50: pp 634–635 [Editorial]


New human gene (increased) tally reignites debate

This is a recent editorial in Nature.

DwN

New human gene (increased) tally reignites debate

Some fifteen years after the human genome was sequenced, researchers still can’t agree on how many genes it contains

––––Cassandra Willyard

One of the earliest attempts to estimate the number of genes in the human genome involved tipsy geneticists, a bar in Cold Spring Harbor (NY), and pure guesswork. That was in 2000, when a draft human genome sequence was still in the works; geneticists were running a sweepstake on how many genes humans have, and wagers ranged from tens of thousands to hundreds of thousands of genes. Almost two decades later, scientists armed with real data still can’t agree on the number — a knowledge gap that they say hampers efforts to spot disease-related mutations.

The latest attempt to plug that gap uses data from hundreds of human tissue samples and was posted on the BioRxiv preprint server on 29 May. It includes almost 5,000 genes that had not previously been spotted — among them nearly 1,200 that carry instructions for making proteins. And the overall tally of more than 21,000 protein-coding genes is a substantial jump from previous estimates, which put the figure at around 20,000.

But many geneticists aren’t yet convinced that all the newly proposed genes will stand up to close scrutiny. Their criticisms underscore just how difficult it is to identify new (unequivocal protein-coding) genes, or even define what a gene is. “People have been working hard at this for 20 years, and we still don’t have the answer,” says Steven Salzberg, a computational biologist at Johns Hopkins University in Baltimore, Maryland, whose team produced the latest count.

Hard to pin down

In 2000, with the genomics community abuzz over the question of how many human genes would be found, Ewan Birney launched the GeneSweep contest. Birney, now co-director of the European Bioinformatics Institute (EBI) in Hinxton, UK, took the first bets at a bar during an annual genetics meeting, and the contest eventually attracted more than 1,000 entries and a US$3,000 jackpot. Bets on the number of genes ranged from more than 312,000 to just under 26,000, with an average of around 40,000. These days, the span of estimates has shrunk — with most now between 19,000 and 22,000 — but there is still disagreement (See ‘Gene Tally’ box).

https://media.nature.com/w800/magazine-assets/d41586-018-05462-w/d41586-018-05462-w_15856180.png

Source: M. Pertea & S. L. Salzberg

The gene count can vary –– depending on the data being analysed, the tools used, and the criteria for weeding out false positives. The latest count used a larger dataset and different computational methods from previous efforts, as well as broader criteria for defining a gene. Salzberg’s team used data from the Genotype-Tissue Expression (GTEx) project, which sequenced RNA from more than 30 different tissues taken from several hundred cadavers. RNA is the intermediary between DNA and proteins. The researchers wanted to identify genes that encode a protein and those that don’t but still serve an important role in cells. So they assembled GTEx’s 900 billion tiny RNA snippets and aligned them with the human genome.

Just because a stretch of DNA is expressed as RNA, however, does not necessarily mean it’s a gene. So the team attempted to filter out noise using a variety of criteria. For example, they compared their results with genomes from other species, reasoning that sequences shared by distantly related creatures have probably been preserved by evolution because they serve a useful purpose, and so are likely to be genes.

The team was left with 21,306 protein-coding genes and 21,856 additional non-coding genes — many more than are included in the two most widely used human-gene databases. The GENCODE gene set, maintained by the EBI, includes 19,901 protein-coding genes and 15,779 non-coding genes. RefSeq, a database run by the US National Center for Biotechnology Information (NCBI), lists 20,203 protein-coding genes and 17,871 non-coding genes.
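To make the gap between the three tallies concrete, here is a short sketch that computes totals and the relative increase of the new protein-coding count over each database. The numbers are taken from the paragraph above; the variable and function names are illustrative only.

```python
# Gene counts quoted in the article (protein-coding, non-coding).
counts = {
    "Salzberg team": {"coding": 21306, "noncoding": 21856},
    "GENCODE": {"coding": 19901, "noncoding": 15779},
    "RefSeq": {"coding": 20203, "noncoding": 17871},
}


def pct_increase(new: int, old: int) -> float:
    """Percentage by which `new` exceeds `old`."""
    return 100.0 * (new - old) / old


# Total genes (coding + non-coding) per source.
totals = {name: c["coding"] + c["noncoding"] for name, c in counts.items()}

new_coding = counts["Salzberg team"]["coding"]
for db in ("GENCODE", "RefSeq"):
    old_coding = counts[db]["coding"]
    print(
        f"{db}: {new_coding - old_coding} more protein-coding genes "
        f"({pct_increase(new_coding, old_coding):.1f}% increase)"
    )
print(totals)
```

The percentage depends on which database is taken as the baseline, which is one reason a single summary figure for the increase is only approximate.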

Kim Pruitt, a genome researcher at the NCBI in Bethesda, Maryland, and a former head of RefSeq, says the difference is probably due in part to the volume of data that Salzberg’s team analysed. And there’s another major difference. Both GENCODE and RefSeq rely on manual curation — a person reviews the evidence for each gene and makes a final determination. Salzberg’s group relied solely on computer programs to sift the data. “If people like our gene list, then maybe a couple years from now we’ll be the arbiter of human genes,” says Salzberg.

Tricky tally

But many scientists say they need more evidence to be convinced that the list is accurate. Adam Frankish, a computational biologist at the EBI who coordinates the manual annotation of GENCODE, says that he and his group have scanned about 100 of the protein-coding genes identified by Salzberg’s team. By their assessment, only one of those seems to be a true protein-coding gene.

And Pruitt’s team looked at about a dozen of the Salzberg group’s new protein-coding genes, but didn’t find any that would meet RefSeq’s criteria. Some overlapped with regions of the genome that seem to belong to retroviruses that invaded our ancestors’ genomes; others belong to other repetitive stretches, which are rarely translated into proteins. But Salzberg says that some repetitive sequences can be considered genes. One example is ERV3-1, which appears in RefSeq and encodes a protein that is over-expressed in colorectal cancer. Salzberg also acknowledges that the new genes on his team’s list will require validation by his team and others.

Further confounding counting efforts is the imprecise and changing definition of a gene. Biologists used to see genes as sequences that code for proteins, but then it became clear that some non-coding RNA molecules have important roles in cells. Judging which are important — and should be deemed genes — is controversial, and could explain some of the discrepancies between Salzberg’s count and others.

Still, it’s likely that at least some of the genes identified by Salzberg’s group will turn out to be valid, says Emmanouil Dermitzakis, a geneticist at the University of Geneva in Switzerland, who co-chairs the GTEx project. He isn’t surprised that the team’s count for protein-coding genes is a 5% increase on previous tallies –– given the gargantuan size of the GTEx data set.

Having an accurate tally of all human genes is important for efforts to uncover links between genes and disease. Uncounted genes are often ignored, even if they contain a disease-causing mutation, Salzberg says. But adding genes –– with great haste –– to the master list can pose risks, too, says Frankish. A gene that turns out to be incorrect can divert geneticists’ attention away from the real problem. Still, the inconsistencies in the number of genes from database to database are problematic for researchers, Pruitt says. “People want one answer,” she adds, “but biology is complex.”

Nature June 2018; 558: 354-355


How did Life begin?

This topic is a bit on the periphery of ‘Gene-Environment Interactions,’ but –– without Life on Earth –– we wouldn’t have genes to interact with the environment. 🙂

How and Why did Life come to exist on this planet? Is it simple for life to emerge on many newly formed planets? Or is it the virtually impossible product of a long series of unlikely events? Advances in fields –– as unrelated as astronomy, planetary science and chemistry –– now hold promise that answers to such profound questions may be coming soon. If life turns out to have emerged multiple times in our galaxy (as scientists expect to discover), then that means the path to “development of life” cannot be so difficult. Moreover, if the route from chemistry-to-biology proves simple to traverse, the universe could be teeming with life.

The discovery of thousands of exoplanets [see attached editorial] has sparked a renaissance in origin-of-life studies. In a stunning surprise, almost all the newly discovered solar systems look very different from our own. Does that mean something about our own, very unusual, system favors the emergence of life? Detecting signs of life on a planet orbiting a distant star is not going to be easy, but the technology for teasing out subtle “biosignatures” is developing –– so rapidly that, with any luck, we may see solid evidence of distant “Life” within one or two decades.

To understand how Life might begin, we must first understand how (and with what ingredients) planets normally form. A new generation of radio telescopes has provided beautiful images of proto-planetary disks and maps of their chemical composition. This information is motivating scientists to develop better models of how planets assemble from the dust and gases of a disk. Within our own solar system, the Rosetta mission (which had visited a comet) has helped us to appreciate early materials that formed Earth. OSIRIS-REx will soon visit an asteroid, and even try to return samples from it, which might give us the essential inventory of the materials that came together in our planet.

Once a planet such as our Earth (not too hot and not too cold, not too dry and not too wet) has formed, what carbon-hydrogen-oxygen (CHO) chemistry must then develop to yield the Building Blocks of life? In the 1950s the Miller-Urey experiment (which zapped a mixture of water and simple chemicals with electric pulses, to simulate the impact of lightning) demonstrated that amino acids, the building blocks of proteins, are easy to create in a chemical flask (i.e. in vitro). Other molecules of life turned out to be harder to synthesize, however. It is now apparent that we need to completely reimagine the path from chemistry to life. The central reason hinges on the versatility of RNA, a very long molecule that plays a multitude of essential roles in all existing forms of life. RNA can not only act like an enzyme, but it can also store and transmit information. The [attached] excellent editorial provides the Reader with a concise summary.

Nature 10 May 2018; 557: pp S13–S15

COMMENT: Hi Dan, I like the Panspermia Model. It has had a number of really smart and famous proponents –– including Fred Hoyle, Stephen Hawking, and even S. Steven Potter (ha ha)!!

So, the basic notion is that Life came from Outer Space. Microorganismal life should be able to travel in interplanetary space on dust particles, knocked into space (from the planet of origin) by meteorites or interplanetary collisions. (Actually, there are many meteorites from Mars found on the surface of Earth; at last count, 132 have been found.) Microorganismal life on dust particles can be driven away from a sun by solar wind, and then decelerated by solar wind in another solar system. Upon hitting an atmosphere, the dust particles would float to the surface like a feather, rather than burn like a rock (i.e. a meteor).

But, then, WHERE did the Life in outer space originate?? Well, it has been estimated that there are ~100 billion planets in our Milky Way galaxy. For Life to begin, you only need a one-time accident, an “original incubator planet”, to get things started. And then these microorganisms would be spread throughout the galaxy. The Panspermia Model increases your odds by 100 billion –– because Life would only be required to have started once and then spread. I like those odds.

“Getting Life started” looks to be very, very tricky. Some have compared it to “assembling a wristwatch that happened by chance”, as an argument for Divine Creation. Cells do have a lot of vital parts, and we don’t really have any good models for how you go from organic soups to dividing cells. I do like “the RNA-first” idea of how things might have started, and maybe “somewhere out there” –– ideal conditions one way or another existed for creating the first living cell. Given a hundred billion planets, there would be a lot of possible “sets of starting conditions”.


All Young Cannabis Users Face Higher Risk of Psychosis

The breakthrough/landmark study was published recently in J Am Med Assn Psychiatry, but this article on MedScape today is what caught my attention. Every one of us has had experience with family members, relatives, neighbors, and/or friends who have seen their teenager –– go from a ‘bright-eyed bushy-tailed ambitious exuberant pre-teen’ to a ‘laid-back apathetic slacker’ by the time they are age 18 or 20. (Sadly, this personality affect appears to be irreversible.)

And, of course, there are INTER-INDIVIDUAL DIFFERENCES, i.e. the risk of “loss of mental acuity”, as well as risk of psychosis (the phenotype; the trait) depends on each person’s genetic susceptibility (genotype). Clearly some kids will be ‘more resistant to loss of mental acuity or more resistant to psychosis’ (drug toxicity) than others. And these differences arise from five categories of factors: genetic differences, epigenetic effects, endogenous influences, environmental factors, and, yes, even each person’s microbiome differences.

All Young Cannabis Users Face Higher Risk of Psychosis

Pauline Anderson

June 15, 2018

Cannabis use directly increases the risk for psychosis in teens, new research suggests.

A large prospective study of teens shows that “in adolescents, cannabis use is harmful” with respect to psychosis risk, study author Patricia J. Conrod, PhD, professor of psychiatry, University of Montreal, Canada, told Medscape Medical News. The effect was observed for the entire cohort. This finding, said Conrod, means that all young cannabis users face psychosis risk, not just those with a family history of schizophrenia or a biological factor that increases their susceptibility to the effects of cannabis.

“The whole population is prone to have this risk,” she said. The study was published online June 6 in JAMA Psychiatry.

Rigorous Causality Test

Increasingly, jurisdictions across North America are moving toward cannabis legalization. In Canada, a marijuana law is set to be implemented later this year. With such changes, there is a need to understand whether cannabis use has a causal role in the development of psychiatric diseases, such as psychosis.

To date, the evidence with respect to causality has been limited, because studies typically assess psychosis symptoms at only a single follow-up and rely on analytic models that might confound intraindividual processes with initial between-person differences. Determining causality is especially important during adolescence, a period when both psychosis and cannabis use typically start.

For the study, researchers used random intercept cross-lagged panel models (RI-CLPMs), which Conrod described as “a very novel analytic strategy.”

RI-CLPMs use a multilevel approach to test for within-person differences that inform on the extent to which an individual’s increase in cannabis use precedes an increase in that individual’s psychosis symptoms, and vice versa. The approach provides the most rigorous test of causal predominance between two outcomes, said Conrod.

“One of the problems in trying to assess a causal relationship between cannabis and mental health outcomes is the ‘chicken-or-egg issue’. Is it that people who are prone to mental health problems are more attracted to cannabis, or is it something about the onset of cannabis use that influences the acceleration of psychosis symptoms?” she said.

The study included 3,720 adolescents from the Co-Venture cohort, which represents 76% of all grade 7 students attending 31 secondary schools in the greater Montreal area. For 4 years, students completed an annual Web-based survey in which they provided self-reports of past-year cannabis use and psychosis symptoms.

Such symptoms were assessed with the Adolescent Psychotic-Like Symptoms Screener; frequency of cannabis use was assessed with a six-point scale (0 indicated never, and 5 indicated every day). Survey information was confidential, and there were no consequences of reporting cannabis use.

“Once you make those guarantees, students are quite comfortable about reporting, and they become used to doing it,” said Conrod.

Marijuana Use Highly Prevalent

The first time-point occurred at a mean age of 12.8 years. Twelve months separated each assessment. In total, 86.7% and 94.4% of participants had a minimum of two time-points out of four on psychosis symptoms and cannabis use, respectively.

The study revealed statistically significant positive cross-lagged associations, at every time-point –– from cannabis use to psychosis symptoms reported 12 months later, over and above the random intercepts of cannabis use and psychosis symptoms (between-person differences). The statistical significances varied from P < .001 to P < .05.

Cannabis use, in any given year, predicted an increase in psychosis symptoms a year later, said Conrod.

This type of analysis is more reliable than biological measures, such as blood tests, said Conrod. “Biological measures aren’t sensitive enough to the infrequent and low level of use that we tend to see in young adolescents,” she said.

In light of these results, Conrod called for increased access by high school students to evidence-based cannabis prevention programs. Such programs exist, but there are no systematic efforts to make them available to high school students across the country, she said.

“It’s extremely important that governments dramatically step up their efforts around access to evidence-based cannabis prevention programs,” she said.

Currently, marijuana use in teens is “very prevalent,” she said. Surveys suggest that about 30% of older high school students in the Canadian province of Ontario use cannabis.

“I’d like to see governments begin to forge some new innovative policy that will address this level of use in the underaged,” Conrod said.

Reducing access to, and demand for, cannabis among youth could of course lead to reductions in risk for major psychiatric conditions, she said.

A limitation of the study was that cannabis use and psychosis symptoms were self-reported and were not confirmed by clinicians. However, as the authors note, previous work has shown positive predictive values for such self-reports of up to 80%.

Unique Research

Commenting on the findings for Medscape Medical News, Robert Milin, MD, child and adolescent psychiatrist, addiction psychiatrist, and associate professor of psychiatry, University of Ottawa, said the study is at “the vanguard” of major research investigating cannabis use in adolescents over time that is being carried out by the National Institute on Drug Abuse in the United States.

“The study is at the forefront because it is specifically looking to measure psychosis symptoms and cannabis use in adolescents, and the model they are using strengthens the study,” said Milin. That model uses “refined measures or improved measures to look at causality, vs what we call temporal associations,” he said.

The fact that the study investigated teens starting at age 13 years is also unique, said Milin. In most related studies, the starting age of the participants is 15 or 16 years.

He emphasized that the study examined psychosis symptoms and not psychotic disorder, although having psychotic symptoms increases the risk for a psychotic disorder.

The study was supported by grants from the Canadian Institutes of Health Research. Dr Conrod and Dr Milin have disclosed no relevant financial relationships.

JAMA Psychiatry. Published online June 6, 2018.


What allows TEMPERATURE (of the environment) to influence the developmental pathways that determine sex?

It is well known that the sex (i.e. whether an individual is male or female) of certain reptiles can be determined by the ambient temperature of the environment, although the precise molecular mechanism has remained elusive. What is it –– that allows TEMPERATURE to influence the developmental pathways that determine sex? The experiments to identify a master sex-determining gene in species with genetic sex determination are well established: One identifies genes on the sex chromosomes, demonstrates which of these are differentially expressed in male and female embryos early in development, and then manipulates their expression so as to demonstrate reversal of sex.

This approach does not work, however, when identifying mechanisms of temperature-dependent sex determination (TSD). Differences in temperature can exert their effect on any of the many autosomal (i.e. those chromosomes that are not sex chromosomes) genes involved in sexual differentiation –– even those peripherally involved, provided their altered expression is capable of reversing sex. Thus, although TSD was discovered in reptiles in 1966, we have not understood the mechanisms of TSD. However, authors [see attached full article] found that transcription of the chromatin-modifier gene Kdm6b (lysine-specific demethylase-6B) responds to temperature in the red-eared slider turtle, conferring temperature sensitivity to a sex-determining gene, Dmrt1 (doublesex- and mab-3–related transcription factor-1).

Dmrt1 expression is high at male-producing temperature (MPT) and low at female-producing temperature (FPT). As such, Dmrt1 became a strong candidate for the male sex-determining gene in this TSD species of turtle, consistent with the master sex-determining role of other “DM-domain–containing” genes in certain fish, amphibians, and birds. These “DM-domain” genes initiate and maintain the male sexual trajectory, while suppressing genes important for female development –– during critical stages of embryogenesis.

Authors showed [see attached] that experimental down-regulation of Kdm6b at 26°C (normally the MPT) shifts embryos from a male to a female developmental trajectory –– because the protein KDM6B plays a central role in epigenetic regulation of gene expression. Suppressing Kdm6b expression decreases demethylation of its target, trimethylated lysine 27 on histone 3 (H3K27), a histone modification that would otherwise repress Dmrt1 promoter activity. Thus, high amounts of KDM6B at MPT activate Dmrt1 gene expression and determine male sex, whereas decreased amounts of KDM6B repress Dmrt1 expression, resulting in female development.

These GEITP pages have often described “epigenetic effects” as “non-DNA-sequence effects that contribute to a trait” (phenotype) –– and which include DNA-methylation, RNA-interference, histone modifications, and chromatin remodeling. This [attached] paper shows convincing evidence of a role in TSD for highly conserved epigenetic modifiers such as KDM6B. This exciting study therefore establishes causality, and a direct genetic link between epigenetic mechanisms (histone modifications) and temperature-dependent sex determination (i.e. a phenotype) in a turtle species.

Science 11 May 2018; 360: 645–648 [main article] & pp 601–602 [editorial]

COMMENT: Very interesting. It would be nice to see if KDM6B plays the same role in certain fish species in which temperature also determines sex…

COMMENT: Excellent comment/question, Linda, i.e. does the ‘same set of master genes and downstream cascade of events’ –– which is seen in this turtle –– also operate in some, or all, of the other poikilotherms (animals that cannot regulate their body temperature –– except by behavioral means such as basking-in-the-sun or burrowing-into-the-ground)?

And, an even “broader, more universal” question: does this same set of master genes and downstream cascade of events also occur in homeotherms (animals that are able to maintain their body temperature at a constant level –– usually above that of the environment –– by their metabolic activity)? And, if not, WHY NOT (i.e. what modifications have evolved in homeotherms –– during the past several hundred million years –– to alter this set of master genes found in this turtle)?

These articles [attached] say that temperature-dependent determination of sex “occurs in certain reptiles, amphibians, fish and birds.” Birds are not poikilotherms, but they do lay eggs that need to be incubated for a period of time before hatching; sex is determined during embryogenesis and fetogenesis, and therefore these processes would take place during the incubation time of the egg. And what about monotremes (egg-laying mammals such as the echidna and duck-billed platypus)? These papers open up far more questions than they answer.

COMMENT: In matters of sex determination, I think this [attached] is a very interesting paper that you might find relevant to the current ongoing discussion.

COMMENT: A question was entertained earlier: Are mammals different from the turtle with regard to sex determination? The attached article describes sex reversal (in the mouse) following deletion of a single distal enhancer of the Sox9 gene. In mammals, the Sry (sex-determining region-Y) gene encodes the protein SRY that is transiently expressed and initiates testis, and subsequent male development, by triggering cells of the supporting cell lineage to differentiate into Sertoli cells (male testis cells), rather than granulosa cells typical of ovaries. Sox9 (SRY-box 9), the main target of SRY, is critical for differentiation of Sertoli cells, subsequently functioning along with other transcription factors –– notably those encoded by the genes Sox8 (SRY-box 8) and then Dmrt1 (doublesex- and mab-3–related transcription factor-1, i.e. THE SAME GENE AS WAS DESCRIBED BELOW FOR THE TURTLE..!!).

Both gain-of-function and loss-of-function studies in mouse and human demonstrate that Sox9 plays a key role in testis determination. Notably, in humans, males that are heterozygous (one gene copy of the gene pair) for null mutations in SOX9 develop campomelic dysplasia, a severe syndrome in which 70% of XY patients show female development. Authors [see attached] discovered –– via in vivo high-throughput chromatin-accessibility techniques, transgenic assays, and genome-editing –– several novel sex-determining regulatory elements in the 2-million-base-pair (2 Mb; 2 megabase) “gene desert” upstream of Sox9. Although others are redundant, Enh13, a 557–base-pair element located 565,000 base-pairs (565 kilobases; kb; kbp) 5′, is essential to initiate mouse testis development; deletion of this element gives XY females with Sox9 transcript levels equivalent to XX gonads (organs that produce ova, not sperm). These results are consistent with the time-sensitive activity of SRY, during development, and indicate a strict order of enhancer usage. Clinically, ENH13 is conserved and embedded within a 32.5-kb region whose deletion in patients is associated with XY sex reversal (i.e. changing from male to female) –– suggesting this enhancer is critical not only in mice, but in humans.

Posted in Center for Environmental Genetics | Comments Off on What allows TEMPERATURE (of the environment) to influence the developmental pathways that determine sex?

Whole-genome sequencing (WGS) studies: Implications for autism spectrum disorder

Whole-genome sequencing (WGS) is becoming increasingly common for investigating the full spectrum of genetic variation associated with relatively common complex diseases (e.g. obesity, cancer, autism spectrum disorder), but the challenges in interpreting data are considerable. Remember that complex diseases –– just like most instances of drug efficacy or toxicity, or genetic differences in response to an environmental toxicant –– are multifactorial traits (representing contributions from hundreds, if not thousands, of genes, plus epigenetic factors, plus environmental effects).

The primary motivation for a WGS study is to understand whether structural, rare, and de novo DNA variants (also called nucleotide changes; single-nucleotide variants, SNVs; mutations) –– in the noncoding genome –– contribute to disease etiology, in addition to the better-understood contribution from mutations in the coding genome. Keep in mind that “the human (i.e. coding) exome” comprises ~180,000 exons, which represent ~30 megabases (Mb, or million bases), or about 1% of the total genome. This means the noncoding genome represents the remaining 99% of the human genome..!! Nevertheless, DNA variants in the human exome are believed to harbor ~85% of the mutations that exhibit large-effect contributions to disease.
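
The "about 1%" figure can be checked with a line or two of arithmetic. The sketch below is only a back-of-the-envelope illustration: the ~30 Mb exome size comes from the paragraph above, whereas the ~3,100 Mb total genome size is an assumption added here (a commonly quoted round figure for the haploid human genome), and the function name is hypothetical.

```python
# Back-of-the-envelope check of the exome fraction quoted above.
EXOME_MB = 30        # ~30 Mb of coding exons (figure taken from the text)
GENOME_MB = 3_100    # ASSUMPTION: ~3.1 billion bases in the haploid human genome


def exome_fraction(exome_mb: float = EXOME_MB, genome_mb: float = GENOME_MB) -> float:
    """Return the share of the genome occupied by the coding exome."""
    if genome_mb <= 0:
        raise ValueError("genome_mb must be positive")
    return exome_mb / genome_mb


fraction = exome_fraction()
print(f"coding exome:        {fraction:.2%} of the genome")
print(f"noncoding remainder: {1 - fraction:.2%} of the genome")
```

Changing the assumed genome size by a few hundred megabases moves the result only slightly, so the "1% coding, 99% noncoding" split is robust to the exact figure chosen.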

Authors [see attached study & editorial] provide the first serious attempt to establish a framework for enrichment analyses of rare noncoding variation in WGS studies of common complex diseases. They evaluated, by way of WGS, rare and de novo noncoding SNVs, insertions/deletions (indels), and all classes of structural variations (e.g. copy-number variations, CNVs; large rearrangements; repetitive types of DNA). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions –– authors defined 51,801 annotation categories..!!

Analysis of 519 autism spectrum disorder (ASD) families (i.e. unaffected vs affected members) did NOT identify any particular association with any categories –– after authors corrected for 4,123 effective tests (this contradicts a number of previously published ASD studies). Without appropriate correction, biologically plausible associations are observed –– but in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still DO exhibit the strongest associations with ASD. Thus, in autism, the contribution of de novo variants in the noncoding region is probably modest, in comparison to that of de novo coding variants. Robust results from future WGS studies will require even larger cohorts and comprehensive analytical strategies that consider this substantial burden of multiple-testing. Prediction of risk of ASD is extremely unlikely, whereas identification of future drug targets for treating ASD remains a possibility.
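
To get a feel for what "correcting for 4,123 effective tests" implies, here is a minimal sketch using a simple Bonferroni correction. This is an illustration only: Bonferroni is a stand-in chosen here for its simplicity and is not claimed to be the authors' exact procedure; the 0.05 significance level is an assumed convention, and the function names are hypothetical. The counts of 4,123 effective tests and 51,801 categories are taken from the text.

```python
# Illustrative only: Bonferroni is a simple stand-in for multiple-testing
# correction; it is not claimed to be the procedure used in the paper.
ALPHA = 0.05              # ASSUMPTION: conventional family-wise error rate
EFFECTIVE_TESTS = 4_123   # effective tests quoted in the text
CATEGORIES = 51_801       # annotation categories quoted in the text


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of tests passing an uncorrected cutoff if every null is true."""
    return alpha * n_tests


print(f"cutoff for {EFFECTIVE_TESTS} effective tests: "
      f"{bonferroni_threshold(ALPHA, EFFECTIVE_TESTS):.2e}")
print(f"cutoff if all {CATEGORIES} categories were independent: "
      f"{bonferroni_threshold(ALPHA, CATEGORIES):.2e}")
print(f"expected chance hits at an uncorrected {ALPHA}: "
      f"{expected_false_positives(ALPHA, EFFECTIVE_TESTS):.0f}")
```

The last line is the point of the paragraph above: with thousands of tests, an uncorrected cutoff is expected to produce a couple of hundred "hits" by chance alone, which is why plausible-looking associations can appear in both cases and controls.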

Nature Genetics May 2018; 50: 727–736 & 635–637 [News’N’Views editorial]

Posted in Center for Environmental Genetics | Comments Off on Whole-genome sequencing (WGS) studies: Implications for autism spectrum disorder

Microglial control of astrocytes in response to microbial metabolites: AHR participates in CNS inflammation (e.g. multiple sclerosis)

Authors [see attached article] identified positive and negative regulators that mediate the means by which microglia [cells in the glia of brain that function as macrophages (scavengers) in the central nervous system (CNS)] control astrocytes (star-shaped glial cells in the CNS). These exciting results define a pathway through which microbial metabolites [derived from bacteria in the gastrointestinal (GI) tract] limit pathogenic activities of microglia and astrocytes –– thereby resulting in the suppression of CNS inflammation. Authors suggest that this pathway may lead to new therapies for multiple sclerosis and other (inflammatory) neurological disorders.

That darned aryl hydrocarbon receptor (AHR) transcription factor just seems to keep popping up in the middle of every critical-life function. 🙂 Microglia in the CNS have been known to express AHR. To investigate the role of microglial AHR on CNS inflammation, authors generated a transgenic mouse in which a tamoxifen-inducible promoter drives expression of Cre recombinase fused to an estrogen ligand-binding domain. After treatment of these mice with tamoxifen, AHR-expressing peripheral cells are replenished from bone marrow –– while microglia remain AHR-deficient without any abnormal cell death. Microglial AHR deletion led to worsened conditions of experimental autoimmune encephalomyelitis (EAE), which led to increasing demyelination and CNS monocyte recruitment; the T-cell response remained unaffected. Collectively, these findings suggest that microglial AHR limits (i.e. is able to suppress) EAE. NF-κB (a protein complex that controls transcription of DNA, cytokine production, and cell survival) controls microglial responses during EAE, and AHR can limit NF-κB activation in a SOCS2-dependent manner (SOCS2 = suppressor of cytokine signaling-2, a key regulator of growth hormone, insulin-like growth factor, and other signaling pathways implicated in inflammation and cancer). Deletion of microglial AHR thus caused decreases in Socs2 expression and resulted in up-regulation of transcripts associated with microglial activation, inflammation, and neurodegeneration.

As discussed recently in these GEITP pages, AHR is known to be associated with “reception of environmental, as well as endogenous, signals” –– including the stress signal of inflammation. This results in a cascade of downstream events programmed to respond to those incoming signals (in ways to promote cell and organism survival) [recently reviewed in: Progr Lipid Res 2017; 67: 38]. Below is Table 1 from that review. As can be seen, as a member of the bHLH/PAS family, AHR participates in virtually all fundamental/developmental and critical-life functions in the living organism:

Table 1

Summary of organs, systems, cell functions, and developmental biology in which AHR-signaling is involved.

Location: AHR-signaling pathway involvement
Central nervous system: Development of brain and nervous system; Neurogenesis; Neuronal cell development; Cardiorespiratory brainstem development in ventrolateral medulla; “Brain-gut-microbiome”
Eye: Ciliary body formation and function; Thyroid-associated eye disease
Gastrointestinal tract: Development of GI tract; Rectal prolapse during aging; “Brain-gut-microbiome”
Heart: Development of heart organ; Cardiovascular physiology; Atherogenesis; Cardiomyogenesis; Cardiorespiratory function
Hematological system: Development of blood cell-forming system; Hematopoiesis; Activation or suppression of erythroid development
Immune system: Immune system development; The immune response; Innate immunity; Pro-inflammatory response; Anti-inflammatory response; Immunomodulatory effects
Inner ear: Development of the cochlea
Kidney: Development of the kidney; Hypertension
Liver: Development of liver organ; Hyperlipidemia; Glucose and lipid metabolism; Hepatic steatosis
Musculoskeletal system: Transmesoderm → osteoblast transition; Bone formation; Osteoclastogenesis
Pancreas: Development of pancreas; Beta-cell regulation; Pancreatic fibrosis
Endocrine system: Lowered serum testosterone levels; Infertility; Mammary gland duct cell epithelial hyperplasia; Degenerative changes in testis; Germ-cell apoptosis; Endometriosis
Reproductive system: Development of male and female sex organs; Spermatogenesis; Fertility
Respiratory tract: Development of respiratory tract; Disruption of GABA-ergic transmission defects; Cardiorespiratory function
Vascular system: Angiogenesis; Atherosclerotic plaque formation
Skin: Barrier physiology; Atopic dermatitis
Cellular functions: Cell migration; Cell adhesion; Circadian rhythmicity
DNA changes: DNA synthesis; DNA repair; DNA-adduct formation; Mutagenesis

COMMENT: Yes, this process is described in detail [see sections 5.3.4 and 5.3.5 of attached Progr Lipid Res review], and the Ko et al. paper is cited therein.

DwN

COMMENT: AP Sent: Tuesday, June 12, 2018 8:13 AM
RE: Microglial control of astrocytes in response to microbial metabolites: AHR participates in CNS inflammation (e.g. multiple sclerosis)

Notice that our Ko et al. 2016 paper [Stem Cells 2016; 34: 2826–2839; see attached] concluded, among other things, that AHR-expressing embryonic stem (ES) cells restrict cardiogenesis (i.e. can suppress transformation of some of these cells from being committed to heart cell formation) and, instead, commit to a neuroglial cell fate (i.e. transform some of these cells into a commitment to microglia cell formation). It thus appears that AHR expression needs to be repressed in order to maintain ES cell mitotic progression (repeated cell divisions) and to prevent premature loss of pluripotency (changing to tissue-specific cell-types).

Posted in Center for Environmental Genetics | Comments Off on Microglial control of astrocytes in response to microbial metabolites: AHR participates in CNS inflammation (e.g. multiple sclerosis)

Plain-language medical vocabulary for “precision diagnosis”

Following up on the most recent GEITP email about “precision medicine” or “personalized medicine,” this brief letter [see attached] concerns “precision diagnosis.” Often discussed in these GEITP pages have been genome-wide association studies (GWAS) in which one or more genes (genotype) are identified as “being associated with” a particular trait (phenotype). If one has a study population (cohort) of 10,000 having Trait A, compared with a control cohort of 10,000 without Trait A –– this is what is called a “genotype-phenotype association study”.

For a simple example, let’s say we have chosen to study a cohort with “blue eyes”, i.e. what gene(s) are associated with this trait? (And the control cohort is “brown eyes.”) Suppose that, within the 10,000 individuals purported to have blue eyes, there are 10 subjects misdiagnosed because they have green eyes. The results of this study would therefore be “fuzzy,” or “tainted,” by this “lack of precision diagnosis,” or “increased noise.” Many genomicists refer to this as an equivocal phenotype. Everyone would prefer to have a study cohort having only an unequivocal phenotype. The same can be said about any trait –– type-2 diabetes, schizophrenia, breast cancer, autism spectrum disorder –– if some in the study cohort are misdiagnosed, then the results of the study are plagued with decreased statistical power.
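
How misdiagnosis dilutes a genetic signal can be sketched with a toy calculation. Everything numeric below is hypothetical except the 10-in-10,000 misdiagnosis rate from the example above: the allele frequencies (30% in true cases, 20% in controls) are invented for illustration, the function names are made up, and the sketch makes the simplifying assumption that misdiagnosed subjects carry the allele at the control frequency.

```python
# Toy model of how misdiagnosed subjects dilute a case-control difference.
# ASSUMPTION: misdiagnosed "cases" carry the allele at the control frequency.
P_TRUE_CASE = 0.30   # hypothetical allele frequency among correctly diagnosed cases
P_CONTROL = 0.20     # hypothetical allele frequency among controls


def observed_case_frequency(p_true_case: float, p_control: float,
                            misdiagnosis_rate: float) -> float:
    """Allele frequency seen in a case cohort containing some misdiagnosed subjects."""
    if not 0.0 <= misdiagnosis_rate <= 1.0:
        raise ValueError("misdiagnosis_rate must lie between 0 and 1")
    return (1 - misdiagnosis_rate) * p_true_case + misdiagnosis_rate * p_control


def observed_difference(p_true_case: float, p_control: float,
                        misdiagnosis_rate: float) -> float:
    """Case-minus-control allele-frequency difference that the study would measure."""
    return observed_case_frequency(p_true_case, p_control, misdiagnosis_rate) - p_control


# 10 of 10,000 (from the text), then two larger hypothetical rates for comparison.
for rate in (10 / 10_000, 0.05, 0.20):
    diff = observed_difference(P_TRUE_CASE, P_CONTROL, rate)
    print(f"misdiagnosis rate {rate:.1%}: observed difference {diff:.4f}")
```

Under these assumptions the measured difference shrinks in direct proportion to the misdiagnosed fraction, so a handful of errors costs little, whereas a loosely defined phenotype can erase a substantial share of the signal and, with it, statistical power.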

Perhaps especially for patients not yet diagnosed, and those with rare diseases, the affected individuals themselves are an especially critical source of phenotyping information. These patients live with their condition, and develop explicit and implicit knowledge about it –– whether from multiple-clinician evaluations or from other families and patients experiencing diagnosis for similar conditions. Many web sites have been set up, over the past three decades, for those with a common ailment to share their grief/concern/advice/comments/questions. From these interactions, they develop a lexicon of relevant terms; these terms are frequently in plain layman language, but can also include clinical terms.

Human genetics and precision medicine therefore aim to understand the relationship between genetic variants and diseases. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) have transformed the ability to characterize genetic variants comprehensively. Although WES and WGS have led to the discovery of many novel disease-associated genes, the diagnostic yield in patients without a clear clinical diagnosis has been 11% to 25%. The Human Phenotype Ontology (HPO) was created [Am J Hum Genet 2008; 83: 610] to enable ‘deep phenotyping’, i.e. capture of symptoms and phenotypic findings using a logically-constructed hierarchy of phenotypic terms. The HPO has become the de facto standard for representing clinical phenotype data to inform diagnoses for rare genetic diseases by the 100,000 Genomes Project, the NIH Undiagnosed Diseases Program (UDP), and Undiagnosed Diseases Network (UDN), as well as thousands of other clinics, laboratories, tools, and databases. Further details are described in the attached letter. 🙂

DwN

Nature Genetics Apr 2018; 50: 474–476

Posted in Center for Environmental Genetics | Comments Off on Plain-language medical vocabulary for “precision diagnosis”