Knowledge Base

LSU-HEALTH GENOMICS KNOWLEDGE BASE (LSU-HEALTH-GKB)

LSU-Health-GKB: The LSU-Health Genomics Knowledge Base is an online, continuously updated, searchable database of genetic variants, published scientific literature, BIG resources, and other materials that focus on population genomic discoveries for understanding the molecular basis of health disparities. GKB was developed by the BIG Program with a specific focus on cancer and other common human diseases. The database is curated by BIG staff and is regularly updated to reflect ongoing developments in genetics and population genomics. This site provides a compendium of continuously updated databases that can be searched for population genomics-related information on cancer and other common human diseases. We will continue adding information as it becomes available, and we welcome your feedback via email. The site is also linked to other knowledge bases, which are listed below:

The CDC Public Health Genomics Knowledge Base: The CDC Public Health Genomics Knowledge Base is an online, continuously updated, searchable database of published scientific literature, CDC resources, and other materials that address the translation of genomic discoveries into improved health care and disease prevention. The Knowledge Base, cosponsored by the Division of Cancer Control and Population Sciences at the National Cancer Institute, is curated by CDC staff and is regularly updated to reflect ongoing developments in the field. This compendium of databases can be searched for genomics-related information on any specific topic. We will continue to add additional features to the knowledge base and are interested in your feedback via email. Read more at: https://phgkb.cdc.gov/PHGKB/phgHome.action?action=about

ClinVar NCBI: ClinVar aggregates information about genomic variation and its relationship to human health. ClinVar is a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. ClinVar thus facilitates access to and communication about the relationships asserted between human variation and observed health status, and the history of that interpretation. ClinVar processes submissions reporting variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data. The alleles described in submissions are mapped to reference sequences and reported according to the HGVS standard. ClinVar then presents the data for interactive users as well as those wishing to use ClinVar in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible. Read more about using ClinVar: https://www.ncbi.nlm.nih.gov/clinvar/intro/
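
As a rough illustration of the HGVS-style expressions mentioned above, the following Python sketch splits the simplest form, a single-nucleotide substitution, into its parts. The regular expression, the function name parse_simple_hgvs, and the placeholder accession NM_000000.1 are assumptions made for illustration. The pattern covers only a small subset of the HGVS nomenclature and is not how ClinVar itself processes submissions.

```python
import re
from typing import Optional

# Matches only simple substitutions such as "NM_000000.1:c.100A>G"
# (a made-up placeholder, not a real record). Deletions, insertions,
# intronic offsets, and protein-level expressions are out of scope.
_SIMPLE_SUBSTITUTION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+)"
    r":(?P<coord_type>[cgmn])\."
    r"(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_simple_hgvs(expression: str) -> Optional[dict]:
    """Return the parts of a simple substitution, or None if unsupported."""
    match = _SIMPLE_SUBSTITUTION.match(expression.strip())
    if match is None:
        return None
    parts = match.groupdict()
    # Keep the sequence version separate: it identifies which revision
    # of the reference sequence the position refers to.
    parts["version"] = int(parts["version"])
    parts["position"] = int(parts["position"])
    return parts
```

Calling parse_simple_hgvs("NM_000000.1:c.100A>G") yields a dictionary with the accession, version, coordinate type, position, and the reference and alternate bases. Any expression outside the simple substitution form returns None rather than a guess.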

COSMIC v84, released 13-FEB-18. COSMIC, the Catalogue Of Somatic Mutations In Cancer, is the world's largest and most comprehensive resource for exploring the impact of somatic mutations in human cancer. Read more:

http://cancer.sanger.ac.uk/cosmic

Genetic Testing Registry (GTR): The Genetic Testing Registry (GTR®) provides a central location for voluntary submission of genetic test information by providers. The scope includes the test's purpose, methodology, validity, evidence of the test's usefulness, and laboratory contacts and credentials. The overarching goal of the GTR is to advance the public health and research into the genetic basis of health and disease. Read more: https://www.ncbi.nlm.nih.gov/gtr/

MedGen NCBI: MedGen organizes information related to human medical genetics, such as attributes of conditions with a genetic contribution. MedGen is NCBI's portal to information about human disorders and other phenotypes having a genetic component. MedGen is structured to serve health care professionals, the medical genetics community, and other interested parties by providing centralized access to diverse types of content. For example, because MedGen aggregates the plethora of terms used for particular disorders into a specific concept, it provides a Rosetta stone for stakeholders who may use different names for the same disorder. Maintaining a clearly defined set of concepts and terms for phenotypes is essential to support efforts to characterize genetic variation by its effects on specific phenotypes. The assignment of identifiers for those concepts allows computational access to phenotypic information, an essential requirement for the large-scale analysis of genomic data. Read more at:

https://www.ncbi.nlm.nih.gov/medgen/

GeneReviews: GeneReviews, an international point-of-care resource for busy clinicians, provides clinically relevant and medically actionable information for inherited conditions in a standardized journal-style format, covering diagnosis, management, and genetic counseling for patients and their families. Each chapter in GeneReviews is written by one or more experts on the specific condition or disease and goes through a rigorous editing and peer review process before being published online. GeneReviews currently comprises 705 chapters. Read more at: https://www.ncbi.nlm.nih.gov/books/NBK1116/

RefSeqGene NCBI: RefSeqGene, a subset of NCBI's Reference Sequence (RefSeq) project, defines genomic sequences to be used as reference standards for well-characterized genes and is part of the Locus Reference Genomic (LRG) Project. These sequences, labeled with the keyword RefSeqGene in NCBI's nucleotide database, serve as a stable foundation for reporting mutations, for establishing conventions for numbering exons and introns, and for defining the coordinates of other variations. RefSeq mRNA and protein sequences have long been used for this purpose, but have the obvious weakness of not providing explicit coordinates for flanking or intronic sequence. RefSeq chromosome sequences do provide explicit coordinates no matter the relationship to any gene annotation, but have awkwardly large coordinate values that will change when the sequence is updated because of a re-assembly. Sequences of the RefSeqGene project counter both of these drawbacks by providing a more stable, gene-specific genomic sequence for each gene that includes upstream and downstream flanking regions. If modifications must be made to any RefSeqGene sequence, it will be versioned and tools will be provided to facilitate conversion of coordinates. The RefSeqGene sequences are aligned to reference chromosomes, and current and previous chromosome coordinates are available because of that re-alignment. The Clinical Remap tool makes that conversion easy. Read more:

https://www.ncbi.nlm.nih.gov/refseq/rsg/
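
The difference between gene-specific and chromosome coordinates described above can be sketched in a few lines of Python. Everything in this example is hypothetical: the Placement class, the gene_to_chrom function, and all coordinate values are invented for illustration, and the sketch assumes a gap-free alignment, which real RefSeqGene alignments do not guarantee. It is not the Clinical Remap implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where a gene-specific sequence sits on a chromosome (hypothetical)."""
    chrom_start: int  # 1-based chromosome position of the leftmost base
    chrom_end: int    # 1-based chromosome position of the rightmost base
    strand: int       # +1 if the gene runs left to right, -1 otherwise


def gene_to_chrom(gene_pos: int, placement: Placement) -> int:
    """Convert a 1-based gene-specific position to a chromosome position.

    Assumes an ungapped alignment between the two sequences.
    """
    if placement.strand == 1:
        return placement.chrom_start + gene_pos - 1
    # On the minus strand, gene position 1 is the rightmost chromosome base.
    return placement.chrom_end - gene_pos + 1


# The same gene-specific position under two invented assembly placements:
# the gene coordinate stays fixed while the chromosome coordinate shifts.
old_assembly = Placement(chrom_start=1_000, chrom_end=1_999, strand=1)
new_assembly = Placement(chrom_start=1_250, chrom_end=2_249, strand=1)
```

Here gene position 10 maps to chromosome position 1009 under old_assembly and 1259 under new_assembly, which is the kind of shift a re-assembly introduces and a gene-specific reference avoids.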

Locus Reference Genomic (LRG): This database was created to serve as a reference standard for variant reporting. In recognition of the need to create universally accepted reference standards for variant reporting, GEN2PHEN (http://www.gen2phen.org) sponsored a meeting in 2008 with key stakeholders, including EBI, NCBI, HGVS, LSDB curators and other members of the community. The goal of the meeting was to design a reference system that would address the shortcomings of existing systems, including confusion over versioning, and that would allow consistent and unambiguous reporting of variants in clinically relevant loci. The new system, founded on the RefSeqGene project, was named Locus Reference Genomic (LRG). As of October 2013, over 700 LRGs have been created, of which over 400 are public and in use by the community (http://www.lrg-sequence.org/LRG). The aim of the project is to create an LRG for every locus with clinical implications. An LRG is a manually curated record that contains stable, and thus un-versioned, reference sequences designed specifically for reporting sequence variants with clinical implications. Each LRG contains a stable “fixed” section and a regularly updated “updatable” section. The fixed section contains stable genomic DNA sequence for the region of interest, transcripts and proteins deemed essential for reporting variants, and an LRG-specific exon numbering system. The updatable section contains the most recent biological information for each LRG region, including mapping information, annotation of all transcripts and overlapping genes in the region, and legacy exon and amino acid numbering systems. The sequences of each LRG are chosen in collaboration with research and diagnostic laboratories, LSDB (locus-specific database) curators and mutation consortia with expertise in the region of interest. Additional information on collaborators can be found at http://lrg-sequence.org/lrg-collaborators. Read more at: http://www.lrg-sequence.org/about

Pharmacogenomics Knowledge Base (PharmGKB): This is a pharmacogenomics database developed by Stanford University and the NIH. PharmGKB annotates PGx-based drug dosing guidelines published by the Clinical Pharmacogenetics Implementation Consortium (CPIC), the Royal Dutch Association for the Advancement of Pharmacy - Pharmacogenetics Working Group (DPWG), the Canadian Pharmacogenomics Network for Drug Safety (CPNDS) and other professional societies. PharmGKB annotations present a brief summary of the genotype-based dosing recommendations. Read more here: https://www.pharmgkb.org/

Variation Reporter: version 1.4.1.9. NCBI Variation Reporter is a tool for accessing the content of human variation resources at NCBI. You may query NCBI data using your variant calls in a variety of formats. The tool matches them to NCBI data to produce a report that draws on dbSNP, dbVar, ClinVar, and NCBI's own human genomic annotation. The tool accepts input data in the form of variant definitions to query the databases. The data you upload is not a submission, and it is not tracked or archived. You can type or paste your variant calls directly into a text box, or upload them as VCF, HGVS, GVF or BED files. To help interpret your data, you must choose an assembly (especially important if you identify chromosome 1 as 'chr1' or just '1'): the genome reference GRCh37.p13 (hg19) or GRCh38.p2 (hg38). Read more at:

https://www.ncbi.nlm.nih.gov/variation/tools/reporter
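
The point above about naming chromosome 1 as 'chr1' or just '1' can be made concrete with a small Python sketch that converts between the two conventions before a file is uploaded. The function normalize_chrom is hypothetical and is not part of Variation Reporter. It assumes the common convention that the 'chr' style writes the mitochondrial genome as 'chrM' while the bare style writes it as 'MT'.

```python
def normalize_chrom(name: str, ucsc_style: bool = False) -> str:
    """Convert a chromosome name to 'chr1' style or bare '1' style.

    Illustrative only: handles autosomes, X, Y, and the mitochondrial
    genome, and does not validate that the name is a real chromosome.
    """
    # Strip an existing 'chr' prefix, whatever its letter case.
    bare = name[3:] if name.lower().startswith("chr") else name

    # The mitochondrial genome is 'chrM' in one style and 'MT' in the other.
    if bare.upper() in ("M", "MT"):
        return "chrM" if ucsc_style else "MT"

    # Sex chromosomes are conventionally upper case.
    if bare.upper() in ("X", "Y"):
        bare = bare.upper()

    return f"chr{bare}" if ucsc_style else bare
```

For example, normalize_chrom("chr1") gives "1", and normalize_chrom("1", ucsc_style=True) gives "chr1". Applying one convention consistently across a file avoids mixing the two styles in a single upload.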

Variation Viewer: Variation Viewer is a tool for interactive examination and download of nucleotide variants for a specific locus. It allows you to view, search, and navigate variations in genomic context. You can review data from dbSNP, dbVar and ClinVar, or upload your own data. You can search based on chromosomal location, gene, variant IDs from dbSNP and dbVar, or phenotype, and review results both as sequence annotation tracks and in a filterable table. Representative uses include finding variants in the region of a gene, pathological variations in a region on GRCh38, and copy number variations at 15q11.1 on GRCh37/hg19. For a quick start to using Variation Viewer effectively, please watch the introductory video tutorial. There you will learn how to select the assembly, focus on your region of interest, navigate by exons, review HGVS expressions, filter results, link to other databases, and download content of interest. A more detailed version of this documentation, suitable for handouts, is available as the Variation Viewer Factsheet. Related tools and sites include Variation Viewer Help, which gives in-depth information on what you can do, how to do it, and how to create links, and the Variation home page, a portal to databases and tools designed to display, access, or analyze data about variation. Read more at: https://www.ncbi.nlm.nih.gov/variation/view/?q=CFH

NCBI Genome Remapping Service: NCBI Remap is a tool that allows users to project annotation data from one coordinate system to another. This remapping (sometimes called 'liftover') uses genomic alignments to project features from one sequence to the other. For each feature on the source sequence, we perform a base-by-base analysis in order to project the feature through the alignment to the new sequence. There are three variations of Remap. Assembly-Assembly allows the remapping of features from one assembly to another. Clinical allows for the remapping of features from assembly sequences to RefSeqGene sequences (including transcript and protein sequences annotated on the RefSeqGene) or from RefSeqGene sequences to an assembly. Alt loci remap allows for the mapping of features between the Primary assembly unit and the Alternate Loci and Patches assembly units available for GRC assemblies. You can view a short video describing how to use Remap here: http://www.youtube.com/watch?v=0lhcMGGReVQ. Read more at: https://www.ncbi.nlm.nih.gov/genome/tools/remap#tab=rsg
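
The base-by-base projection described above can be sketched in Python as follows. This is a simplified model, not NCBI's implementation: it assumes the alignment is given as gap-free, same-strand blocks, and the names Block, project_base, and remap_feature, as well as the coordinates in the usage note, are invented for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One gap-free aligned segment between source and target (0-based)."""
    src_start: int
    dst_start: int
    length: int


def project_base(pos: int, blocks: List[Block]) -> Optional[int]:
    """Project one source position through the alignment, if it is aligned."""
    for block in blocks:
        if block.src_start <= pos < block.src_start + block.length:
            return block.dst_start + (pos - block.src_start)
    return None  # The base falls in an unaligned gap.


def remap_feature(
    start: int, end: int, blocks: List[Block]
) -> Optional[Tuple[int, int, float]]:
    """Project the half-open feature [start, end) base by base.

    Returns (new_start, new_end, coverage), where coverage is the
    fraction of source bases that could be projected, or None if no
    base of the feature is aligned.
    """
    projected = [project_base(p, blocks) for p in range(start, end)]
    hits = [p for p in projected if p is not None]
    if not hits:
        return None
    coverage = len(hits) / (end - start)
    return min(hits), max(hits) + 1, coverage
```

With blocks [Block(0, 100, 50), Block(60, 150, 40)], the feature [10, 20) projects completely to [110, 120), while the feature [45, 65) spans an unaligned gap and projects to [145, 155) with a coverage of 0.5. Reporting coverage alongside the new coordinates is one way to flag features that only partially survive the remapping.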

PheGenI: The Phenotype-Genotype Integrator (PheGenI) merges NHGRI genome-wide association study (GWAS) catalog data with several databases housed at the National Center for Biotechnology Information (NCBI), including Gene, dbGaP, OMIM, eQTL and dbSNP. This phenotype-oriented resource, intended for clinicians and epidemiologists interested in following up results from GWAS, can facilitate prioritization of variants to follow up, study design considerations, and generation of biological hypotheses. Users can search based on chromosomal location, gene, SNP, or phenotype and view and download results including annotated tables of SNPs, genes and association results, a dynamic genomic sequence viewer, and gene expression data. PheGenI is still under active development. Currently, the phenotype search terms are based on MeSH and will be enhanced with additional options in the future. A tutorial is available on the NCBI YouTube channel. Read more at: https://www.ncbi.nlm.nih.gov/gap/phegeni

1000 Genomes Browser: The 1000 Genomes Project ran between 2008 and 2015, creating the largest public catalogue of human variation and genotype data. As the project ended, the Data Coordination Centre at EMBL-EBI has received continued funding from the Wellcome Trust to maintain and expand the resource. Read more at:

https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/

dbVar: dbVar is NCBI's database of human genomic structural variation: insertions, deletions, duplications, inversions, mobile elements, and translocations. Structural variation (SV) is generally defined as a region of DNA approximately 1 kb and larger in size and can include inversions and balanced translocations or genomic imbalances (insertions and deletions), commonly referred to as copy number variants (CNVs). These CNVs often overlap with segmental duplications, regions of DNA >1 kb present more than once in the genome, copies of which are >90% identical. If present at >1% in a population, a CNV may be referred to as a copy number polymorphism (CNP). In 1991, Charcot-Marie-Tooth (CMT) disease was the first autosomal dominant disease associated with a gene dosage effect due to an inherited DNA rearrangement. Most cases of CMT1A are associated with a 1.5-Mb tandem duplication in 17p11.2-p12, mediated by flanking segmental duplications, that encompasses the PMP22 gene. The disease phenotype results from having three copies of the normal gene. The reciprocal product of the recombination, a single copy of the PMP22 gene, results in the clinically distinct hereditary neuropathy with liability to pressure palsies (HNPP). Read more at:

https://www.ncbi.nlm.nih.gov/dbvar/content/overview/
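
The size and frequency thresholds in the dbVar description above (roughly 1 kb for structural variation, more than 1% population frequency for a copy number polymorphism) can be expressed as a short Python sketch. The function describe_variant and its labels are hypothetical and are not part of dbVar. The cutoffs are treated as hard limits here only for simplicity, whereas the text describes them as approximate conventions.

```python
from typing import Optional

SV_MIN_BP = 1_000      # "approximately 1 kb and larger"
CNP_MIN_FREQ = 0.01    # "present at >1% in a population"


def describe_variant(
    length_bp: int,
    is_imbalance: bool,
    population_freq: Optional[float] = None,
) -> str:
    """Label a variant using the conventions quoted in the text.

    is_imbalance is True for gains or losses of sequence and False for
    balanced events such as inversions or balanced translocations.
    """
    if length_bp < SV_MIN_BP:
        return "below structural variation size threshold"
    if not is_imbalance:
        return "balanced structural variant"
    if population_freq is not None and population_freq > CNP_MIN_FREQ:
        return "copy number polymorphism (CNP)"
    return "copy number variant (CNV)"
```

Under these assumptions, a 1.5-Mb duplication with no frequency supplied, such as the CMT1A duplication mentioned above, is labelled "copy number variant (CNV)", and a 5 kb deletion seen in 5% of a population is labelled "copy number polymorphism (CNP)".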

GWAS Catalog: The Catalog was founded by the NHGRI in 2008, in response to the rapid increase in the number of published genome-wide association studies (GWAS). These studies provide an unprecedented opportunity to investigate the impact of common variants on complex disease; however, identifying published GWAS can be challenging, and the vast wealth of data contained within these publications is effectively inaccessible to researchers without systematic cataloguing and summarization of the observed associations. The GWAS Catalog provides a consistent, searchable, visualisable and freely available database of published SNP-trait associations, which can be easily integrated with other resources, and is accessed by scientists, clinicians and other users worldwide. Within the Catalog, all eligible GWA studies are identified by literature search and assessed by our curators, who then extract the reported trait, significant SNP-trait associations, and sample metadata. We aim to curate eligible studies within 1-2 months of publication, dependent on the availability of literature, and the data is released on a weekly cycle. The Catalog also publishes the iconic GWAS diagram of SNP-trait associations, mapped onto the human genome by chromosomal location and displayed on the human karyotype. Since 2010, delivery and development of the Catalog has been a collaborative project between the EMBL-EBI and NHGRI. The Catalog website and infrastructure are now hosted by EMBL-EBI. A team of curators, all of whom are experienced molecular biologists, curate the data and provide user support. The curation team is supported by dedicated software developers, and is led by Fiona Cunningham, Paul Flicek & Helen Parkinson, and Lucia Hindorff at the NHGRI. We work closely with the SPOT and Ensembl teams at the EBI, and receive input from an independent Scientific Advisory Board.
For more details about the Catalog curation process and data extraction procedures, please refer to the Methods page. Read more at: https://www.ebi.ac.uk/gwas/

dbSNP: Database of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and non-polymorphic variants. The Single Nucleotide Polymorphism database (dbSNP) is a public-domain archive for a broad collection of simple genetic polymorphisms. This collection of polymorphisms includes single-base nucleotide substitutions (also known as single nucleotide polymorphisms or SNPs), small-scale multi-base deletions or insertions (also called deletion insertion polymorphisms or DIPs), and retroposable element insertions and microsatellite repeat variations (also called short tandem repeats or STRs). Please note that in this chapter, you can substitute any class of variation for the term SNP. Each dbSNP entry includes the sequence context of the polymorphism (i.e., the surrounding sequence), the occurrence frequency of the polymorphism (by population or individual), and the experimental method(s), protocols, and conditions used to assay the variation. dbSNP accepts submissions for variations in any species and from any part of a genome. This document will provide you with options for finding SNPs in dbSNP, discuss dbSNP content and organization, and furnish instructions to help you create your own (local) copy of dbSNP. Read more at: https://www.ncbi.nlm.nih.gov/snp

dbGaP: The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans. Read more at: https://www.ncbi.nlm.nih.gov/gap

OMIM NCBI: OMIM (Online Mendelian Inheritance in Man) is an online catalog of human genes and genetic disorders. Read more at: https://www.ncbi.nlm.nih.gov/omim

Gene NCBI: Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.

https://www.ncbi.nlm.nih.gov/gene/

IMPORTANT NOTE: BIG does not independently verify information submitted to these databases; it relies on submitters to provide information that is accurate and not misleading. BIG makes no endorsements of tests or laboratories listed in these databases. BIG is not a substitute for medical advice. Patients and consumers with specific questions about a genetic test should contact a health care provider or a genetics professional.

School of Medicine

Bioinformatics and Computational Medicine Training Program
