Genome-wide association study in Europeans (N=176,678) reveals genetic loci for tanning response to sun exposure

Repeated exposure to the sun is well known to be associated with increased risk of all skin cancers, including cutaneous malignant melanoma (CMM), basal cell carcinoma and squamous cell carcinoma, and these types of cancer are more common in fair-skinned, rather than darker-skinned, people. The tanning response after exposure to sunlight is also well known to be mainly determined by melanin pigmentation, which protects the skin from DNA photo-damage. Genome-wide association studies (GWAS) on European populations have previously identified several DNA variants in or near seven genes: ASIP (agouti-signaling protein), EXOC2 (exocyst-complex component-2), HERC2 (HECT- and RLD-domain-containing E3 ubiquitin protein ligase-2), IRF4 (interferon regulatory factor-4), MC1R (melanocortin-1 receptor), SLC45A2 (solute-carrier family 45, member-2), and TYR (tyrosinase). These seven genes are known to be associated with both pigmentation-related traits (e.g. hair, eye or skin color) and skin cancer.

Authors [see attached] chose to investigate further the genetic basis of skin-tanning and its effect on skin cancer susceptibility by performing a large-scale GWAS using data from the UK Biobank (N = 176,678 subjects of European ancestry); starting with a much larger cohort should ‘find’ additional significant genetic loci. They identified significant associations with tanning ability at 20 loci, confirming previously identified associations at six of these loci and reporting 14 novel loci (ten of which have never before been associated with pigmentation-related phenotypes).

In addition to identifying and replicating genes previously associated with ease of skin-tanning or pigmentation-related phenotypes (traits), authors (intriguingly to me) demonstrated that DNA variants at the AHR/AGR3 locus are associated with both ease of skin-tanning and risk of non-melanoma skin cancer. These two genes, AHR (aryl hydrocarbon receptor) and AGR3 (anterior-gradient-3, protein disulfide isomerase family member), reside next to one another on human chromosome 7p21.1.

The former gene, AHR (first discovered by yours truly and Alan Poland in 1974), is known to be associated with “reception of environmental, as well as endogenous, signals”, resulting in a cascade of downstream events programmed to respond to those incoming signals (in ways to promote cell and organism survival). In the case of “sunlight” as the signal, undoubtedly the “genetic response” includes cell cycle genes and DNA-repair genes [recently reviewed in: Progr Lipid Res 2017; 67: 38].

I didn’t know if it was possible until I tried, but I see it IS possible to download from that journal article and paste below Table 1 and Figures 8 & 9 from that elegant review. 🙂 As can be seen, as a member of the bHLH/PAS family, AHR participates (lends a helping hand) in virtually all fundamental, developmental and critical-life functions in the living organism.

Table 1

Summary of organs, systems, cell functions, and developmental biology in which AHR-signaling is involved.

Location | AHR-signaling pathway involvement
Central nervous system | Development of brain and nervous system; Neurogenesis; Neuronal cell development; Cardiorespiratory brainstem development in ventrolateral medulla; “Brain-gut-microbiome”
Eye | Ciliary body formation and function; Thyroid-associated eye disease
Gastrointestinal tract | Development of GI tract; Rectal prolapse during aging; “Brain-gut-microbiome”
Heart | Development of heart organ; Cardiovascular physiology; Atherogenesis; Cardiomyogenesis; Cardiorespiratory function
Hematological system | Development of blood cell-forming system; Hematopoiesis; Activation or suppression of erythroid development
Immune system | Immune system development; The immune response; Innate immunity; Pro-inflammatory response; Anti-inflammatory response; Immunomodulatory effects
Inner ear | Development of the cochlea
Kidney | Development of the kidney; Hypertension
Liver | Development of liver organ; Hyperlipidemia; Glucose and lipid metabolism; Hepatic steatosis
Musculoskeletal system | Transmesoderm → osteoblast transition; Bone formation; Osteoclastogenesis
Pancreas | Development of pancreas; Beta-cell regulation; Pancreatic fibrosis
Endocrine system | Lowered serum testosterone levels; Infertility; Mammary gland duct cell epithelial hyperplasia; Degenerative changes in testis; Germ-cell apoptosis; Endometriosis
Reproductive system | Development of male and female sex organs; Spermatogenesis; Fertility
Respiratory tract | Development of respiratory tract; Disruption of GABA-ergic transmission defects; Cardiorespiratory function
Vascular system | Angiogenesis; Atherosclerotic plaque formation
Skin | Barrier physiology; Atopic dermatitis
Cellular functions | Cell migration; Cell adhesion; Circadian rhythmicity
DNA changes | DNA synthesis; DNA repair; DNA-adduct formation; Mutagenesis
Oxidative stress | Mitochondrial ROS formation; Anti-oxidant protection against ROS formation; Mitochondrial H2O2 production; Crosstalk with hypoxia and HIF-signaling pathways; Transforming growth factor-β signaling pathways; MID1-PP2A-CDC25B-CDK1 signaling pathway regulating mitosis
Tumor cells | Growth suppression; Tumor initiation; Tumor promotion
ES cell basic functions | Ectoderm → epithelium transition; Cell adhesion; Cell-cycle regulation; Apoptosis; Cavitation during morula → blastula formation; Activator of Rho/Rac GTPases; WNT-signaling pathways; Homeobox-signaling pathways
Other basic functions | Transgenerational inheritance; Epigenetic effects; Chromatin remodeling; Histone modification; Aging-related and degenerative diseases

Nature Commun 2018; 9: 1684

Posted in Center for Environmental Genetics | Comments Off on Genome-wide association study in Europeans (N=176,678) reveals genetic loci for tanning response to sun exposure

Pharmacogenomics of GPCR Drug Targets — Data-Mining Experiment, rather than a Genome-Wide Association Study

COMMENT: Yes, Dan, this is a key point in precision medicine.

Many of the examples that you mention are likely to be modulated by the immune system, with completely unanticipated immune reactions against the whole drug molecule. If it is a large molecular-weight drug (such as abacavir), it can bind directly to specific HLA types [the human leukocyte antigen (HLA) genes encode the major histocompatibility complex (MHC) proteins in humans; these cell-surface proteins are responsible for regulation of the immune system], which are located on the surface of antigen-presenting cells for T-cell activation. Alternatively, if it is a small-molecular-weight drug, the antibody that is produced can be directed against a hapten (a small molecule that, when combined with a larger carrier, e.g. a protein, can elicit the production of antibodies that bind specifically to it), which means the drug binds covalently with a peptide.

At the present time, there are about 40 different HLA alleles that have been associated with increased risk of adverse drug reactions (ADRs) caused by different drugs. To a lesser degree, I think ADRs can be determined by rare genetic variants [abstract of recent article (Hum Genomics 2018; 12: 26) pasted below].

Integrating rare genetic variants into pharmacogenetic drug response predictions

Ingelman-Sundberg M, Mkrtchian S, Zhou Y, Lauschke VM.

BACKGROUND: Variability in genes implicated in drug pharmacokinetics or drug response can modulate treatment efficacy or predispose to adverse drug reactions. Besides common genetic polymorphisms, recent sequencing projects revealed a plethora of rare genetic variants in genes encoding proteins involved in drug metabolism, transport, and response.

RESULTS: To understand the global importance of rare pharmacogenetic gene variants, we mapped the variability in 208 pharmacogenes by analyzing exome sequencing data from 60,706 unrelated individuals and estimated the importance of rare and common genetic variants using a computational prediction framework optimized for pharmacogenetic assessments. Our analyses reveal that rare pharmacogenetic variants were strongly enriched in mutations predicted to cause functional alterations. For more than half of the pharmacogenes, rare variants account for the entire genetic variability. Each individual harbored on average a total of 40.6 putatively functional variants, rare variants accounting for 10.8% of these. Overall, the contribution of rare variants was found to be highly gene- and drug-specific. Using warfarin, simvastatin, voriconazole, olanzapine, and irinotecan as examples, we conclude that rare genetic variants likely account for a substantial part of the unexplained inter-individual differences in drug metabolism phenotypes.

CONCLUSIONS: Combined, our data reveal high gene and drug specificity in the contributions of rare variants. We provide a proof-of-concept on how this information can be utilized to pinpoint genes for which sequencing-based genotyping can add important information to predict drug response, which provides useful information for the design of clinical trials in drug development and the personalization of pharmacological treatment.

COMMENT: This topic falls precisely under the heading of “gene-environment (G x E) interactions.” Or better yet, in the case of this paper, “drug-genome interactions.” This is among the most intriguing mysteries in all of clinical pharmacology: a drug is given to patient A and it works as expected (efficacy), but given to patient B, the drug causes an adverse drug reaction (ADR), and given to patient C, there is no beneficial or toxic effect (therapeutic failure). How does a small-molecular-weight drug, given to some patients but not the majority of patients in any population, cause an ADR that is often indistinguishable from a complex disease?

For example, sitagliptin is approved by the FDA to treat type-2 diabetes; yet, a small subset of patients taking the recommended dose develops acute pancreatitis. Hydroxychloroquine (given to treat malaria, lupus erythematosus, or rheumatoid arthritis) can also lead to acute pancreatitis in some patients. In a subpopulation of patients receiving many psychotropic drugs (e.g. valproic acid), undesirable weight gain can occur as a dose-independent ADR; in another small subset, hepatic steatosis (fatty liver) has been found. In a small subpopulation of patients taking bisphosphonates for osteoporosis, increased risk of esophageal and gastric cancer has been reported in a number of studies; however, a large meta-analysis of this association has found no significantly increased risk [Wright et al., BMJ Open 2015; 5: e007133].

DwN

COMMENT: Hi Dan, The most remarkable finding in this Cell paper, I think, is the shift of the mu-opioid receptor, by a rare mutation, to respond to naloxone as an agonist instead of as an antagonist. However, more wet-lab experiments are needed to verify some of these key findings.

PREVIOUS POST

The [attached] article is a bombshell report and, to our knowledge, represents the first study of its kind. Rather than a genome-wide association study (GWAS), authors performed an avant-garde data-mining in silico approach to search for DNA variants in or near each of 108 G-protein-coupled receptor (GPCR) genes in the human genome. In the field of pharmacology and drug response, these 108 genes are the known targets of 475 prescription drugs that have been approved by the U.S. Food and Drug Administration (FDA). These 475 drugs, which comprise ~34% of all prescription drugs, account for a global sales volume of >US$180 billion annually!

Each of the genomes of almost 68,500 individuals was separately investigated for missense variants in and near each of the GPCR genes. Then the authors searched the literature for the clinical associations with altered drug response in these individuals. To estimate the de novo missense mutation rate within these GPCR genes, authors in addition identified novel mutations from >1,700 control trios (having no reported pathological conditions) –– which were compiled from ten different studies registered in the “denovo-database,” an intriguing collection of germline de novo variants (http://denovo-db.gs.washington.edu/denovo-db/).

To demonstrate proof-of-principle, authors then experimentally showed that certain variants of the mu-opioid and cholecystokinin receptors resulted in altered drug responses and/or idiosyncratic dose-independent adverse drug reactions. These amazing results — on just two of the 108 GPCR genes — underscore the need to characterize DNA variants among all 108 of the GPCR genes. Authors suggest that “the ultimate results of this novel type of in silico study might enhance prescription precision, improve patients’ quality-of-life, and remove some of the economic and societal burden caused by variability in drug response.”

We anticipate that such “dry-lab” data-mining studies (i.e. just sitting in front of a computer and searching databases online), such as this landmark publication [attached], are likely to become a major new way to approach pharmacogenomics research in the near future! 🙂

DwN

Cell Jan 2018; 172: 41–54

Einstein’s quotes — which ones are true and which are not?

This (somewhat tongue-in-cheek-humorous) one-page book report –– about “which quotes REALLY DID originate from Einstein, and which ones are attributed falsely to him?” [attached] is worth sharing with all GEITP’ers. Beyond his towering contributions to Physics, Albert Einstein was an avid commentator on Education, Marriage, Money, the Nature of Genius, Music-Making, Politics, and more. His insights were legendary, as we are reminded by the recent publication of Volume 15 in The Collected Papers of Albert Einstein. Even the website of the U.S. Internal Revenue Service enshrines his words (as quoted by his accountant): “The hardest thing in the world to understand … is the income tax.”

“There appears to be a bottomless pit of quotable gems to be mined from Einstein’s enormous archives,” notes Alice Calaprice, editor of The Ultimate Quotable Einstein (2011), but there might be a hint of despair in her comment. Indeed, Einstein could be the “most quoted scientist in history”. The website Wikiquote has many more entries attributed to Einstein than to Aristotle, Galileo Galilei, Isaac Newton, Charles Darwin, or Stephen Hawking, and even more than to Einstein’s opinionated contemporaries Winston Churchill and George Bernard Shaw. However, how much of this super-abundance actually emanated with certainty from Einstein? See the attached article to find out. 🙂

Nature 3 May 2018; 557: 30

Evolutionary origin of mitochondria predates the origin of Alphaproteobacteria

Mitochondria are known as the “energy factories” of most eukaryotic cells. Bacteria, including the earliest ones that originated on the planet, do not have mitochondria and typically carry a single chromosome (haploid, as in most prokaryotes) rather than chromosome pairs (diploid, as in many eukaryotes). Most eukaryotic cells have a nucleus, and the cytoplasm outside the nucleus contains many types of organelles, including hundreds of mitochondria. These mitochondria are involved in various processes, of which ATP generation by means of oxidative phosphorylation (used by the cell to generate energy) is a hallmark feature. Plants have, in addition to mitochondria, comparable organelles called chloroplasts, which carry out photosynthesis.

“The Endosymbiotic Theory” describes how a large host cell and one or more ingested bacteria could easily have become dependent on one another for survival (hence, this is a topic of gene-environment interactions), resulting in a permanent relationship. After more than 2 billion years of evolution, mitochondria and chloroplasts have become more specialized, and today they cannot live outside the cell. In humans, while there are >20,000 genes in the nuclear genome, there are 37 genes in the mitochondrial genome. During fertilization, a sperm (derived usually from the man) combines with an egg (derived usually from the woman), and normally only the egg’s mitochondria are passed on to the embryo. When there is a serious defect in a gene of the woman’s mitochondrial genome, today there is a clinically successful in vitro fertilization scheme in which the man’s sperm and the nucleus of the woman’s egg are combined with mitochondria from a healthy woman, following which a healthy baby is formed, derived from three persons.

To trace the evolutionary history of mitochondria and their role in the genesis of eukaryotes, detailed knowledge about the identity and nature of the mitochondrial ancestor is important. Alphaproteobacteria is a distinct Class of bacteria (which evolved later than early bacteria) in the Proteobacteria phylum; its members are highly diverse, and certain Alphaproteobacteria can cause specific human (and agricultural) diseases, but nevertheless they share a common evolutionary ancestor. Although the alphaproteobacterial origin of mitochondria is generally undisputed, efforts to resolve the phylogenetic position of mitochondria in the Alphaproteobacterial species tree have failed to reach an agreement.

Whereas most studies support the idea that mitochondria evolved from an ancestor related to Rickettsiales (an Order within Alphaproteobacteria that includes several host-associated pathogenic and endosymbiotic lineages), other studies suggest that mitochondria evolved from a free-living group. Authors [see attached publication] re-evaluated the phylogenetic placement of mitochondria. They used genome-resolved binning of oceanic meta-genome datasets and increased the genomic sampling of Alphaproteobacteria with twelve divergent clades, plus one clade representing a sister group to all Alphaproteobacteria. Subsequent phylogenomic analyses –– that specifically address long-branch attraction and compositional bias artifacts –– suggest that mitochondria did not evolve from Rickettsiales or any other currently recognized Alphaproteobacterial lineage. Rather, the analyses of these authors indicate that mitochondria evolved from a proteobacterial lineage that branched off before the divergence of all sampled Alphaproteobacteria. In light of this new finding, previous hypotheses about the nature of the mitochondrial ancestor will have to be re-evaluated. 🙂

Nature 3 May 2018; 557: 101–105

The Post-GWAS Era: From Association to Function

After the discovery of the structure of DNA in 1953 and the deciphering of the genetic code in the 1960s, the field of human genetics was largely focused on understanding the structure and function of protein-coding genes and how rare mutations in these genes might be associated with causing disease or increasing risk of disease. Furthermore, the central dogma of molecular biology held that “genes are first transcribed into messenger RNA (mRNA), after which the mRNA is translated into protein.” Because of the straightforward nature of the genetic code, it SEEMED easy to predict how alterations of the underlying DNA sequence would change the gene product (amino-acid sequence of the resulting protein). In addition, it was clear from Mendelian genetics that diseases “that run in families in predictable patterns” are caused by mutations in a single gene. Beginning with the mapping of the genetic causes of, for example, sickle-cell anemia and the neurodegenerative disorder Huntington disease, the causative mutations underlying many Mendelian diseases were elucidated by positional cloning, and an important hurdle had been overcome in our understanding of the genetic bases of human disease.

However, many of the most common and (financially and emotionally) burdensome diseases, such as cardiovascular disease, cancer, Alzheimer disease, Parkinson disease, and type-2 diabetes, are typically not caused by single mutations. Such “multifactorial traits” are instead influenced by a combination of multiple genetic, epigenetic, and environmental risk factors, and thus do not follow “simple” Mendelian inheritance patterns. The departure from a “one-gene, one-mutation, one-outcome” model posed a formidable challenge to elucidating the biology of these diseases. Multifactorial traits, by definition, are influenced by many genes (polygenic). Human height, for example, appears to be affected by genetic variation at hundreds if not thousands of loci across the genome. These genetic loci may interact in additive, or in non-additive (i.e., epistatic; gene-gene interactions), ways.
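To make the additive versus non-additive (epistatic) distinction concrete, here is a minimal sketch in Python. The locus names, effect sizes and interaction weight are hypothetical values chosen purely for illustration; they do not come from any study.

```python
from typing import Dict, Tuple

def additive_score(genotypes: Dict[str, int], effects: Dict[str, float]) -> float:
    """Additive model: every copy of the effect allele at a locus adds the same amount."""
    return sum(effects[locus] * copies for locus, copies in genotypes.items())

def epistatic_score(
    genotypes: Dict[str, int],
    effects: Dict[str, float],
    interactions: Dict[Tuple[str, str], float],
) -> float:
    """Non-additive model: the additive score plus a term for each interacting locus pair."""
    score = additive_score(genotypes, effects)
    for (locus_a, locus_b), weight in interactions.items():
        # The pair contributes only when effect alleles are present at both loci.
        score += weight * genotypes[locus_a] * genotypes[locus_b]
    return score

# Hypothetical individual: number of effect-allele copies (0, 1 or 2) per locus.
genotypes = {"locus1": 1, "locus2": 2}
effects = {"locus1": 0.5, "locus2": 0.25}       # hypothetical per-allele effects
interactions = {("locus1", "locus2"): 0.1}      # hypothetical gene-gene interaction

print(additive_score(genotypes, effects))
print(epistatic_score(genotypes, effects, interactions))
```

In the additive case, the contribution of one locus does not depend on the genotype at any other locus; the interaction term is what breaks that independence.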

Yet, while it may not always be necessary to understand the cause of a disease in order to successfully treat it, such a mechanistic understanding certainly increases the likelihood that a successful therapeutic intervention will be achieved. The attached review summarizes what has happened since the first genome-wide association studies (GWAS) of the 2002–2006 era, which set out to identify loci that harbor genetic variants [typically single-nucleotide variants (SNVs) or polymorphisms (SNPs)] associated with risk for complex diseases and quantitative traits. The earliest two GWAS that I can find linked the lymphotoxin-α gene (LTA) to myocardial infarction (2002) and the complement factor H gene (CFH) to age-related macular degeneration (2005). Today, the GWAS era has been successful in the sense that thousands of loci have been statistically significantly associated with risk for diseases and traits, and a notable number of these loci are well-replicated, suggesting that they are true associations.

Several factors have made it difficult, however, to bridge the gap between the statistical associations linking locus-and-trait and a functional understanding of the biology underlying disease risk. First, the association of a DNA locus with disease does not specify which variant (or variants) at that locus is actually causing the association (the “causal variant”), nor which gene (or genes) is affected by the causal variant (the “target gene”). The former problem is due to the fact that there are often many co-inherited variants in strong linkage disequilibrium (LD; the non-random association of alleles at different loci along the same strand of DNA, same chromosome, in a given population) with the most significant (or “sentinel”) disease-associated variant, comprising a haplotype. Within the haplotype, genetic variants in strong LD often have statistically indistinguishable associations with disease risk; as a consequence, empirical validation might be needed to determine which of the linked variants are functional. Second, more than 90% of disease-associated SNVs are located in non-protein-coding regions of the genome, and many of them are far away from the nearest known gene.
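As a rough illustration of what “strong LD” means numerically, the sketch below computes two standard LD measures for a pair of biallelic loci: the coefficient D = p(AB) − p(A)·p(B), and r² = D² / [p(A)(1 − p(A))·p(B)(1 − p(B))]. The haplotype and allele frequencies used here are hypothetical; the review itself gives no such numbers.

```python
from typing import Tuple

def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # zero when the alleles are inherited independently
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    return d, d * d / denom

# Hypothetical case 1: A and B always travel together (perfect LD).
d_linked, r2_linked = linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5)

# Hypothetical case 2: A and B are independent (no LD).
d_indep, r2_indep = linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5)

print(d_linked, r2_linked)
print(d_indep, r2_indep)
```

When r² is close to 1, genotypes at one variant almost perfectly predict genotypes at the other, which is why their association statistics become indistinguishable.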

Am J Hum Genet 3 May 2018; 102: 717–730

Valid Statistical Rationales for Sample Sizes

This webinar provides guidance on how to justify sample sizes, and thereby indirectly provides guidance on how to choose them.

Date: Wednesday, June 6, 2018
Time: 10:00 AM PDT | 01:00 PM EDT
Duration: 90 Minutes
Instructor: John N. Zorich

Overview:

This webinar explains the logic behind sample-size choice for several statistical methods that are commonly used in verification or validation efforts, and how to express a valid statistical justification for a chosen sample size.

Who Will Benefit:

QA/QC Supervisor
Process Engineer
Manufacturing Engineer
QA/QC Technician
Manufacturing Technician
R&D Engineer

About Speaker:
John N. Zorich
Statistical Consultant & Trainer, Ohlone College & SV Polytechnic
John N. Zorich has spent 35 years in the medical device manufacturing industry; the first 20 years were as a “regular” employee in the areas of R&D, Manufacturing, QA/QC, and Regulatory; the last 15 years were as a consultant in the areas of QA/QC and Statistics…

Compliance4All

161 Mission Falls Lane, Suite 216, Fremont, CA 94539, USA.

Toll Free: +1-800-447-9407

Fax: 302 288 6884
www.compliance4all.com

HGNC Newsletter, Spring 2018

Some of you might be interested in the HUGO Gene Nomenclature Committee (HGNC) Newsletter for Spring 2018.
An update on our VGNC project
Largely thanks to the work of our new dedicated Vertebrate Gene Nomenclature Committee (VGNC) curator (Tamsin), and our dedicated VGNC programmer (Beth), alongside input from the other HGNC curators –– our vertebrate gene-naming is rapidly expanding.
First, a little recap on how the project works: we select vertebrate species for gene-naming, based on the quality of their genomes and their relevance to the biomedical community. Initially, we use data from our HCOP tool to identify orthologs that are consistently predicted between human and the vertebrate species in question by four key resources: Panther, NCBI Gene, Ensembl Compara and OMA. For these ‘4 out of 4’ orthologs we automatically transfer the human gene nomenclature onto the other species –– provided they have passed a set of additional criteria. The predicted orthologs that fail to meet these criteria are then looked at by a curator, beginning with the ‘3 out of 4’ dataset where three of the four key orthology resources agree.
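The triage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HGNC's actual pipeline: the gene labels and ortholog IDs are hypothetical, and the “set of additional criteria” is not specified in the newsletter, so it is not modelled here.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# The four orthology resources named in the newsletter.
RESOURCES = ("Panther", "NCBI Gene", "Ensembl Compara", "OMA")

def consensus(calls: Dict[str, Optional[str]]) -> Tuple[Optional[str], int]:
    """Return the most frequently predicted ortholog ID and how many resources support it."""
    votes = Counter(calls.get(r) for r in RESOURCES if calls.get(r) is not None)
    if not votes:
        return None, 0
    best_id, support = votes.most_common(1)[0]
    return best_id, support

def triage(predictions: Dict[str, Dict[str, Optional[str]]]):
    """Split human genes into '4 out of 4' (automatic candidates) and '3 out of 4' (curator queue)."""
    four_of_four, three_of_four = {}, {}
    for human_symbol, calls in predictions.items():
        best_id, support = consensus(calls)
        if support == 4:
            four_of_four[human_symbol] = best_id
        elif support == 3:
            three_of_four[human_symbol] = best_id
    return four_of_four, three_of_four

# Hypothetical predictions for three human genes in one target species.
predictions = {
    "GENE_A": {"Panther": "chimp:0001", "NCBI Gene": "chimp:0001",
               "Ensembl Compara": "chimp:0001", "OMA": "chimp:0001"},
    "GENE_B": {"Panther": "chimp:0002", "NCBI Gene": "chimp:0002",
               "Ensembl Compara": "chimp:0002", "OMA": None},
    "GENE_C": {"Panther": "chimp:0003", "NCBI Gene": "chimp:0004",
               "Ensembl Compara": "chimp:0003", "OMA": None},
}

auto, curate = triage(predictions)
print("automatic-transfer candidates:", auto)
print("curator review first:", curate)
```

Genes with weaker agreement (such as GENE_C here) fall into neither bucket and would need a separate review step.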
We currently have a total of 54,783 approved symbols across all four of the current VGNC species. These figures break down per species as: chimpanzee, 15,537; dog, 13,622; cow, 13,342; and horse, 12,282. Many well studied genes now have approved symbols for their orthologs in all four species (chimpanzee, dog, cow and horse), for example MTOR, BRAF and EGFR.
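As a quick arithmetic check, the per-species counts quoted above do add up to the stated total. The short Python snippet below also prints each species' share; the variable names are illustrative only.

```python
# Approved VGNC symbols per species, as quoted in the newsletter.
counts = {"chimpanzee": 15537, "dog": 13622, "cow": 13342, "horse": 12282}

total = sum(counts.values())
print("total approved symbols:", total)

# Share of the total contributed by each species, in percent.
for species, n in counts.items():
    print(f"{species}: {100 * n / total:.1f}%")
```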
Large gene families, especially those where genes are found in clusters, need careful manual curation from the outset. Susan recently manually curated the keratin gene family ahead of presenting a poster at the International Plant & Animal Genome (PAG) conference in January. She found that the two keratin gene clusters are broadly conserved across vertebrates and was able to name most of the keratin genes present on the chimp and cow genome assemblies. The keratin gene family includes unitary pseudogenes, where a one-to-one ortholog can be identified between a pseudogene in one species and a protein-coding gene in another vertebrate species; an example is the KRT89P pseudogene in human, which is the ortholog of the functional cow KRT89 gene.
If you have an interest in a particular gene family, and would like to help us name the gene family members across vertebrates, please email vgnc@genenames.org.
Progress on replacing placeholder symbols
Renaming genes with placeholder symbols continues to be one of our current priorities. This is associated with the VGNC project mentioned above because human ‘C#orf’ gene symbols do not transfer well across species. For example, human C17orf64 (chromosome 17 open-reading frame 64) is currently approved as C17H17orf64 (chromosome 17 C17orf64 homolog) in chimp, C9H17orf64 (chromosome 9 C17orf64 homolog) in dog, C19H17orf64 (chromosome 19 C17orf64 homolog) in cow and C11H17orf64 (chromosome 11 C17orf64 homolog) in horse. Replacing placeholder symbols with symbols based on function, homology, or protein structure provides meaning and allows exact symbol transferral to other species. Some examples of updated placeholder symbols from the last few months are presented below:
Symbol changed from C4orf22 to CFAP299, cilia- and flagella-associated protein 299, now also approved as CFAP299 in chimp, dog and cow
Symbol changed from C11orf70 to CFAP300, cilia- and flagella-associated protein 300, now also approved as CFAP300 in chimp and dog
Symbol changed from C7orf49 to CYREN, cell cycle regulator of NHEJ, now also approved as CYREN in chimp, horse and cow
Symbol changed from C2orf71 to PCARE, photoreceptor cilium actin regulator, now also approved as PCARE in chimp, dog, cow and horse
In addition, the following are examples of our placeholder FAM# symbols, which we use to designate “family with sequence similarity #” of unknown function, that have recently been renamed thanks to publications describing functional data for the gene products:
Symbol changed from FAM212A to INKA1, inka-box actin regulator 1, now also approved as INKA1 in chimp, dog, cow, horse
Symbol changed from FAM212B to INKA2, inka-box actin regulator 2, now also approved as INKA2 in chimp, cow, horse
Symbol changed from FAM109A to PHETA1, PH domain-containing endocytic trafficking adaptor 1, now also approved as PHETA1 in chimp, cow and dog
Symbol changed from FAM109B to PHETA2, PH domain-containing endocytic trafficking adaptor 2, now also approved as PHETA2 in chimp, cow and dog
In cases where an approved symbol is missing for a particular species (e.g. there is currently no CFAP299 symbol for horse), this is usually due to either a lack of consensus between orthology resource predictions or problems with mapping between NCBI Gene and Ensembl gene IDs for that particular species –– which may be due to differences in the automated gene model annotations.
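The cross-species placeholder pattern described in this section (human C17orf64 becoming C9H17orf64 in dog, and so on) can be sketched in a few lines of Python. This is only an illustration of the naming pattern inferred from the examples in the newsletter; the function name `homolog_placeholder` and the regular expression are assumptions made here, not part of any HGNC or VGNC tool.

```python
import re

# Assumed shape of a human placeholder symbol such as "C17orf64":
# "C" + human chromosome + "orf" + serial number.
HUMAN_ORF = re.compile(r"^C(\d{1,2}|X|Y)orf(\d+)$")


def homolog_placeholder(human_symbol: str, species_chromosome: str) -> str:
    """Build a placeholder of the form C<species chr>H<human chr>orf<n>.

    Hypothetical helper: it mirrors the pattern seen in the examples above
    (e.g. C17orf64 on dog chromosome 9 -> C9H17orf64).
    """
    match = HUMAN_ORF.match(human_symbol)
    if match is None:
        raise ValueError(f"not a C#orf# placeholder symbol: {human_symbol!r}")
    human_chromosome, serial = match.groups()
    return f"C{species_chromosome}H{human_chromosome}orf{serial}"


# Chromosome assignments taken from the C17orf64 example in the text.
species_chromosomes = {"chimp": "17", "dog": "9", "cow": "19", "horse": "11"}
for species, chromosome in species_chromosomes.items():
    print(species, homolog_placeholder("C17orf64", chromosome))
```

The output shows why such symbols transfer poorly: one human gene ends up with a different symbol in every species, whereas a function-based symbol such as CFAP299 can be reused unchanged.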
New Gene Family pages
We are continually adding to our gene family resource. Gene families that were recently curated include:
LIMK/TESK kinase family
Ciliogenesis and planar polarity effector complex (CPLANE)
Myogenic regulatory family (MYF)
Myozenins (MYOZ)
GBAF complex, PBAF complex and BAF complex
Enolases (ENO)
Matrilins (MATN)
Junctophilins (JPH)
Interferon induced transmembrane proteins (IFITM)
PI4KA lipid kinase complex
GDNF family ligands
Neurotrophins (NTF)
Neuroligins (NLGN)
Neurexins (NRXN)
tRNA methyltransferases (TRMT)
DAZ RNA binding protein family (DAZ)

Spotlight on a new gene family:
Recent work to reassign the placeholder symbols FAM46A, FAM46B, FAM46C and FAM46D resulted in an in-depth consultation with the research community to attempt to unify the nomenclature for all of the non-canonical RNA polymerases. The new root symbol ‘TENT’ for ‘terminal nucleotidyltransferase’ was agreed upon for the FAM46 family and for the genes that were previously approved with the slightly less informative PAPD (“PAP associated domain containing”) root symbol.
The ‘TUT’ for ‘terminal uridylyl transferase’ genes were also standardised with TUT1 being retained and ZCCHC11 and ZCCHC6 being renamed with the symbols TUT4 and TUT7, which provide more information on gene function and were already in use in the literature. The ‘TUT’ genes were all assigned ‘TENT’ symbol aliases as ‘TUT’ is a more specific subclass of ‘TENT’. The gene symbol ‘MTPAP’ is widely used by the community –– therefore, this symbol was retained and the gene was given the alias ‘TENT6’. You can view the entire TENT family at Terminal nucleotidyltransferases.
Gene Symbols in the News
To begin this edition of ‘Gene Symbols in the News’, we have two news stories that highlight the personal impact of genomic medicine and both also feature a striking coincidence: In the first story, a scientist working on the FOX gene family discovered that her disabled daughter has a random mutation in one of her FOXG1 alleles. Dr Lee describes life caring for her daughter and how she is now studying Foxg1 in the brains of mice to elucidate how the mutated FOXG1 protein causes damage to the developing brain, even though there is also expression of a non-mutated allele.
The second story describes how the work of the ‘Deciphering Developmental Disorders’ project provided a diagnosis for two families with daughters who have learning difficulties; these girls both carry a mutation in the CDK13 gene and, although there are just 11 children in the UK who have been identified with this particular mutation, the two girls featured in this article live just 20 minutes from one another. BBC journalists were present when the girls and their families met for the first time.
In other news, a study has suggested that a TRPM8 gene variant linked to incidence of migraine may have helped humans adapt to living in a cold climate; people descended from ancestors living in colder countries like Finland have a much higher incidence of this variant than those descended from ancestors from warmer climes.
An FGF21 gene variant has recently been linked to an increase in sugar and alcohol intake, a slight increase in blood pressure, a larger waist-to-hip ratio, but surprisingly overall lower levels of body fat.
The SFRP1 gene has been reported as a promising target for treating hair loss; the immunosuppressant Cyclosporine A had previously been shown to trigger hair growth at least in part via inhibition of SFRP1 but has many side effects. A recent study has identified an alternative drug, WAY-316606, that is potentially better at inhibiting SFRP1 activity and causes fewer side effects.
Finally, analysing gene activity in recently perished corpses has been shown to be valuable in helping to accurately determine time of death –– HBA1 is one such gene whose expression levels were found to be increased in 10 different tissues post-mortem.

http://www.genenames.org/hgncnews/hgnc-newsletter-spring-2018

————————————————————————–
If you have questions or comments on our newsletter or on any human gene nomenclature issue, please email us at: hgnc@genenames.org
—————————————————————————
HUGO Gene Nomenclature Committee (HGNC)
European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire
CB10 1SD, UK

The Rise of Zebrafish as a Model for Toxicology

The emergence of the zebrafish (Danio rerio) as a model for biomedical research began in the 1960s at the University of Oregon in Eugene, largely due to the pioneering work of George Streisinger and coworkers. They systematically determined the conditions necessary to adopt this new model (originally from small rivers in India) for the laboratory environment. They also established the foundational techniques for genetic manipulation and imaging that were critical to expanding the imaginations of the next generation of scientists. For the next ~30 years, the vast majority of global zebrafish research efforts focused on understanding the processes of normal vertebrate embryonic development. The use of forward genetics [starting with a phenotype (trait), then proceeding to identify the gene(s) responsible] and reverse genetics (first selecting a gene, disrupting that gene, and then observing the phenotype) –– coupled with careful in situ hybridization methodologies –– dominated the field. As the authors of the two brief reviews [attached] describe, it soon became possible to elucidate vertebrate gene functions in the transparent early-life-stage zebrafish with unprecedented economy and speed (in large part due to the zebrafish’s short generation time).

At the 1994 Cold Spring Harbor “meeting on zebrafish genetics and development”, a group (including Jonathan Knight and Monte Westerfield) was appointed to establish ZFIN, an online database for zebrafish researchers (covering genetic, cellular, anatomical and physiological information, and ultimately genomic similarities with mammals) that continues vibrantly today. Intriguingly, it was discovered that much of this information could be extrapolated quite readily from the zebrafish genome to the human genome; hence its value in discovering the causes of, and developing treatments for, human diseases. The continued and accelerated acceptance of zebrafish has truly been amazing.

Toxicology, by definition, is a deeply applied science that is tightly constrained by the need to focus research efforts on disease prevention. Practical questions –– such as whether a specific chemical is safe, or a given disease is caused or influenced by a specific chemical exposure –– are at the core of the discipline. These questions are often difficult to answer with certainty, presenting a dilemma for individuals tasked with making policy decisions about Risk Assessment and chemical safety. By the mid-1990s, molecular toxicology and environmental genetics began moving the toxicologist away from predominantly describing dose-dependent endpoint relationships, to, instead, discovering chemical (and drug) targets and molecular pathways associated with the development of specific endpoints.

Today, the developmental characteristics of zebrafish are strategically being used by scientists to study topics –– ranging from high-throughput toxicity screens to toxicity in multi- and trans-generational studies. High-throughput technology today has obviously increased the utility of zebrafish embryonic-toxicity assays in screening of chemicals and drugs for toxicity, or downstream effects. In addition, advances in behavioral characterization and experimental methodology allow for observation of recognizable phenotypic changes after exposure to foreign chemicals including drugs. Future directions in zebrafish research are predicted to take advantage of CRISPR/Cas9 genome-editing methods –– in order to quickly create models of disease and interrogate modes of action and mechanisms of action with fluorescent reporters or tagged proteins. Zebrafish has many advantages as a toxicologic model. New methodologies and areas of study continue to expand the usefulness and application of the zebrafish.

Toxicol Sci May 2018; 163: pp 3–4 and pp 5–12

A new understanding as to why mutation rates vary among species

MUTATIONS (i.e. alterations in DNA nucleotides in the haploid genome, base-pairs in the diploid genome) occur when cells copy their DNA incorrectly, or fail to repair damage from chemicals (endogenous or exogenous) or radiation (environmental effects). Some mistakes are beneficial, providing variation that enables organisms to adapt; this is one of the fundamental reasons why EVOLUTION has been occurring for ~3.8 billion years on Earth. However, some of these genetic mistakes can cause the mutation rate to rise –– thus fostering more mutations.

For many years, evolutionary biologists had assumed that mutation rates were identical among all species, and that these rates were SO consistent and predictable they could be used as “molecular clocks” to estimate divergence of one species from another –– as a function of time. Therefore, by counting the number of differences between the genomes of two species, evolutionary geneticists could date when they diverged. And it is well established that Africans, for example, as an older population on Earth, exhibit more mutations in their DNA than Caucasians. However, today, now that geneticists can compare whole genomes of parents and their offspring (by means of whole-genome sequencing [WGS] methodology), we can count the actual number of new mutations per generation.
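The two calculations mentioned above can be sketched numerically. Under the molecular-clock assumption of a constant substitution rate μ (per site, per year), two lineages that split T years ago accumulate differences along both branches, so the expected per-site divergence is d = 2μT, giving T = d / (2μ). With parent-offspring sequencing, the per-generation rate is instead counted directly as new mutations divided by the number of sites examined in the diploid genome. The function names and all numbers below are hypothetical illustrations, not values taken from the editorial.

```python
def divergence_time_years(differences: int, sites_compared: int,
                          rate_per_site_per_year: float) -> float:
    """Molecular-clock estimate T = d / (2 * mu).

    Assumes a constant rate on both lineages, which is exactly the
    assumption that the post describes as being questioned.
    """
    d = differences / sites_compared          # per-site divergence
    return d / (2.0 * rate_per_site_per_year)


def per_generation_rate(de_novo_mutations: int, callable_sites: int) -> float:
    """Direct estimate from a parent-offspring trio.

    The offspring genome is diploid, so each callable site is observed
    in two copies (one inherited from each parent).
    """
    return de_novo_mutations / (2 * callable_sites)


# Hypothetical inputs chosen only to show the arithmetic.
t_assumed = divergence_time_years(12_000, 1_000_000, 1e-9)
t_if_rate_halved = divergence_time_years(12_000, 1_000_000, 0.5e-9)
print(f"split time with assumed rate: {t_assumed:,.0f} years")
print(f"split time if true rate is half: {t_if_rate_halved:,.0f} years")
print(f"trio-based rate: {per_generation_rate(60, 3_000_000_000):.1e} per site per generation")
```

Because T scales inversely with μ, halving the assumed rate doubles the estimated divergence date, which is why species-to-species variation in mutation rate matters so much for dating.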

WGS methodology has thus enabled researchers to measure mutation rates in about 40 species –– including newly reported numbers for orangutans, gorillas, and African green monkeys. All primates so far studied have mutation rates similar to that of humans. As recently reported at an evolution meeting, several labs have found that bacteria, paramecia (single-celled freshwater protists having a slipper-like shape and covered with cilia), yeasts, and nematodes (roundworms), all of which have much larger populations than humans, have mutation rates orders of magnitude lower than those of primates [see the 1-page editorial attached].

In its early years, Homo sapiens (modern human) existed in very small numbers (dozens or hundreds). In large populations of any species (e.g. the bacterium, paramecium, yeast, nematode), natural selection can efficiently eradicate harmful genes. In contrast, among smaller groups –– such as the earliest humans –– selection is less efficient, so undesirable genes (including those that foster mutations) can arise and persist. Support for this hypothesis now comes from data on a range of organisms that show an inverse relationship between mutation rate and ancient population size. This new knowledge offers insights into how cancers and other deleterious diseases might develop. These fascinating new findings also have important implications for efforts of evolutionary biologists to use DNA to date branches on the Tree of Life.

Science 13 Apr 2018; 360: 143 [single page]
