HGNC Newsletter Spring 2017
There are currently 40,979 approved symbols
New collaboration with TGMI
The HGNC is excited to announce that we are now members of the Wellcome Trust-funded Transforming Genetic Medicine Initiative (TGMI). This project strives to facilitate the interpretation and reporting of genetic variation in clinical medicine. At the genomics level this includes identifying the full set of human protein coding genes that can cause disease when mutated, providing manual curation of gene-to-disease association, and developing a consistent clinical annotation system for the reporting of genetic variation.
Standardisation is understandably a key theme for TGMI and that’s where the HGNC comes in, because TGMI will use our approved gene symbols and HGNC IDs, and is already providing us with feedback on clinician experiences and understanding of our nomenclature. Elspeth is a member of the core TGMI team and you can read an interview with her that appeared as one of the TGMI regular Friday blog posts.
Stabilising gene symbols originally created for phenotypes
As part of our work with TGMI, we are also looking to minimise future changes to gene symbols and names for disease-associated genes, to reduce the problems that such changes can cause for the clinical community. We are therefore reviewing gene nomenclature that is no longer widely used, so that we can make any necessary updates now in order to provide stabilisation in the long term. Part of this process has involved reviewing genes that were originally named based on a human phenotype; this also ties in perfectly with our ongoing review of human-centric nomenclature ahead of transferring gene symbols and names across vertebrate species.
In some cases researchers working on the genes in question have agreed we should update both the gene symbol and gene name; these changes will hopefully avoid future confusion about whether anyone is referring to the phenotype or the causative gene, and will definitely please our collaborators at OMIM. For example, CECR1 for “cat eye syndrome chromosome region, candidate 1” has been updated to ADA2 for “adenosine deaminase-2”; GLTSCR1 for “glioma tumor suppressor candidate region gene 1” is now BICRA for “BRD4-interacting chromatin-remodeling-complex-associated protein”; and DYX1C1 for “dyslexia susceptibility 1 candidate 1” is now DNAAF4 for “dynein axonemal assembly factor 4”.
In other cases, we have retained the gene symbol due to very well supported usage in the literature (and resistance to change from researchers), but we have updated the gene name to remove the reference to phenotype and to include information describing the structure or function of the encoded gene product(s). Examples are AUTS2 whose name has been changed from “autism susceptibility candidate 2” to “AUTS2, activator of transcription and developmental regulator”; OVCA2 which now has the gene name “OVCA2, serine hydrolase domain-containing” in place of “ovarian tumor suppressor candidate 2”; CLN8 which has been changed from “ceroid-lipofuscinosis, neuronal 8” to “CLN8, transmembrane ER and ERGIC protein”; and RP1 whose name is now “RP1, axonemal microtubule-associated” rather than “retinitis pigmentosa 1 (autosomal dominant)”.
The trouble with “date” symbols
Feedback from the TGMI project has also prompted us to take action to tackle a longstanding problem: the autoconversion of certain gene symbols into dates in Microsoft Excel spreadsheets. Last year, following the publication of an article in Genome Biology by Ziemann et al., the BBC website ran an online article discussing the claim that as many as a fifth of papers with associated Excel files featured gene symbols converted to dates. In the past we have tried to publicise how to change settings in Excel to avoid the issue, but it is infeasible for us to reach all biologists and clinicians using Excel.
The TGMI has emphasised the huge scope of this problem within a clinical setting and warned that mistakes are inevitable as a result. Following discussions with our Scientific Advisory Board we have decided that we will aim to change these symbols so that they are no longer converted into dates. The affected genes are currently approved using the root symbols MARCH (membrane associated ring-CH-type finger), MARC (mitochondrial amidoxime reducing component) and SEPT (septins). Our initial consultation with the research community publishing on these genes had very mixed results, but we plan to work in collaboration with the TGMI to explain the far-reaching consequences of these symbol-to-date conversions and garner further support for the proposed changes.
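For readers who want to check their own gene lists before opening them in a spreadsheet, the kind of symbol at risk can be flagged with a short script. The following is a minimal sketch only: the function names are hypothetical, the example symbols are illustrative ones built from the root symbols mentioned above, and the rule used (a prefix of an English month name followed by a number from 1 to 31) is our own rough approximation, not Excel's actual conversion logic.

```python
import re

# English month names; a symbol is treated as risky when its letters form
# a prefix (3+ letters) of one of these, e.g. MARC -> MARCH, SEPT -> SEPTEMBER.
MONTHS = (
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", "JULY",
    "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
)

# Letters, an optional hyphen, then a one- or two-digit number.
_PATTERN = re.compile(r"^([A-Za-z]{3,9})-?(\d{1,2})$")


def looks_like_date(symbol: str) -> bool:
    """Heuristic: return True if the symbol resembles a month plus a day."""
    match = _PATTERN.match(symbol.strip())
    if match is None:
        return False
    letters = match.group(1).upper()
    day = int(match.group(2))
    is_month_prefix = any(month.startswith(letters) for month in MONTHS)
    return is_month_prefix and 1 <= day <= 31


def flag_risky_symbols(symbols):
    """Return the symbols that the heuristic considers date-like."""
    return [s for s in symbols if looks_like_date(s)]


if __name__ == "__main__":
    candidates = ["MARCH1", "SEPT2", "MARC1", "ADA2", "BICRA", "DNAAF4"]
    print(flag_risky_symbols(candidates))
```

A check like this will not catch every case and may flag symbols that Excel leaves alone, so it is best treated as a prompt to import the affected column as text rather than as a definitive test.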
Reciprocal links between RNAcentral and genenames.org
We are delighted to announce that RNAcentral is using HGNC data as its source of human non-coding RNAs. RNAcentral links to genenames.org using the database accessions that we manually curate into our records, and HGNC gene symbols and names are now being displayed within RNAcentral sequence records. For more information on how RNAcentral has mapped in our gene names, read their blog. We have also reciprocally added RNAcentral IDs to the ‘nucleotide sequences’ section of non-coding RNA Symbol Reports where applicable. For example, see the URS000075AC83 ID on the ZEB1-AS1 Symbol Report. Clicking on the adjacent RNAcentral link takes you through to the RNAcentral sequence record. All RNAcentral IDs that have been mapped to HGNC IDs are fully searchable on genenames.org.
New Gene Family pages
We are still busy adding new gene family pages in the course of our usual curatorial activities. Recent families include Glutamine amidotransferase-like class-1-domain-containing, Heparin-binding growth factor family, and Seven-beta-strand methyltransferase motif-containing. We have also made a couple of new families with large hierarchical structures. The Heterotrimeric G proteins family has the subsets G protein subunits beta, G protein subunits gamma, and G protein subunits alpha; the last of these is further subdivided into G protein subunits alpha, group i; G protein subunits alpha, group q; G protein subunits alpha, group s; and G protein subunits alpha, group 12/13. The Transposable element-derived genes family, meanwhile, is the highest level in a hierarchy that includes DNA transposon-derived genes, Envelope ERV derived genes and Gag like LTR retrotransposon derived genes.
Gene Symbols in the News
What kind of sleeper are you? Recent studies have linked a variant in the FABP7 gene to disrupted sleep patterns, and a CRY1 gene variant to difficulty both falling asleep at night and waking up the next morning. If you have a sweet tooth, a recent study suggests that you could be carrying a particular variant of FGF21, the gene product of which is secreted by the liver.
There is good news for carriers of a particular ADAMTS7 variant due to its protective association with heart disease, but the news is not so good for carriers who smoke, for whom the protective effects are decreased.
Finally, there has been a new study linking genetics to depression: those with a particular form of the NKPD1 gene show a greater risk of suffering from the disorder.