About Us

Contents

  1. Overview
  2. Support
  3. Data Categories

Overview

CTD is a robust, publicly available database that aims to advance understanding of how environmental exposures affect human health. It provides manually curated information about chemical–gene/protein interactions and about chemical–disease and gene–disease relationships. These data are integrated with functional and pathway data to aid in the development of hypotheses about the mechanisms underlying environmentally influenced diseases.

We also have additional ongoing projects involving manual curation of exposome data and chemical–phenotype relationships to help identify pre–disease biomarkers resulting from environmental exposures.

This year, CTD turned 10! We’re grateful for our strong community support and encourage you to give us feedback so we can continue to evolve with your research needs.

Support

This program is supported by funds from the National Institute of Environmental Health Sciences (NIEHS).

We’re also proud to be part of the NIEHS Environmental Health Science Center at NC State, the Center for Human Health and the Environment (P30ES025128).

Data Categories

Chemicals
CTD integrates a chemical subset of the Medical Subject Headings (MeSH®), the hierarchical vocabulary from the U.S. National Library of Medicine. You can view diverse information about chemicals, including chemical structures, curated interacting genes and proteins, curated and inferred disease relationships, and enriched pathways and functional annotations. You can browse chemicals, or use them to formulate gene, chemical–gene interaction, or reference queries.
Diseases
CTD's MEDIC disease vocabulary is a modified subset of descriptors from the “Diseases” category of the U.S. National Library of Medicine (NLM) Medical Subject Headings (MeSH®), combined with genetic disorders from the Online Mendelian Inheritance in Man® (OMIM®) database. CTD biocurators mapped OMIM diseases to terms within the hierarchical MeSH disease vocabulary to expand our disease representation. This combined vocabulary is used to curate gene–disease and chemical–disease associations. You can browse diseases, or use them to formulate gene or reference queries.
Genes
The CTD cross-species gene vocabulary (symbols, names, and synonyms) is derived from the Gene database at the National Center for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine. You can view diverse information about genes, including curated interacting chemicals, curated and inferred disease relationships, and associated pathways and functional annotations. You can browse genes, or access them using the Keyword search or by formulating advanced queries.
Chemical–Gene/Protein Interactions
To improve understanding about the mechanisms of chemical actions, we manually curate chemical–gene and –protein interactions in vertebrates and invertebrates from the published literature. These interactions are both direct (e.g., “chemical binds to protein”) and indirect (e.g., “chemical results in increased phosphorylation of a protein” via intermediate events).
We curate interactions using a hierarchical interaction-type vocabulary that characterizes common physical, regulatory, and biochemical interactions between chemicals and genes or proteins. This vocabulary comprises 70 terms including actions (e.g., “binds to”, “imports”), operators that describe the degree of a chemical's effect (e.g., “increases”), and qualifiers that specify the form of the gene or chemical involved in an interaction (e.g., “protein” or “chemical metabolite,” respectively).
You can search chemical–gene interactions directly via the chemical–gene interaction query, or access them via a gene, chemical, disease, or reference.
Gene–Disease Associations
CTD contains curated and inferred gene–disease associations. Curated gene–disease associations are extracted from the published literature by CTD biocurators, or are derived from the OMIM database using the mim2gene file from the NCBI Gene database. Inferred associations (see figure) are established via CTD-curated chemical–gene interactions and chemical–disease associations (e.g., gene A is associated with disease B because gene A has a curated interaction with chemical C, and chemical C has a curated association with disease B). Curated and inferred associations are identified as such, and both help users develop hypotheses about mechanisms underlying environmental diseases.
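The inference rule in the example above (gene A, chemical C, disease B) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `infer_gene_disease`, the tuple layout, and the toy records are hypothetical and are not CTD's own code, file format, or data.

```python
from collections import defaultdict


def infer_gene_disease(chem_gene, chem_disease):
    """Return {(gene, disease): set of chemicals that support the inference}.

    chem_gene:    iterable of (chemical, gene) curated interactions
    chem_disease: iterable of (chemical, disease) curated associations
    """
    # Index the curated chemical-disease associations by chemical.
    diseases_by_chem = defaultdict(set)
    for chem, disease in chem_disease:
        diseases_by_chem[chem].add(disease)

    # A gene is linked to a disease through every chemical they share.
    inferred = defaultdict(set)
    for chem, gene in chem_gene:
        for disease in diseases_by_chem.get(chem, ()):
            inferred[(gene, disease)].add(chem)
    return dict(inferred)


# Toy, made-up records (not real CTD data).
chem_gene = [
    ("chemical C", "gene A"),
    ("chemical D", "gene A"),
    ("chemical D", "gene E"),
]
chem_disease = [
    ("chemical C", "disease B"),
    ("chemical D", "disease B"),
]

inferred = infer_gene_disease(chem_gene, chem_disease)
for (gene, disease), chems in sorted(inferred.items()):
    print(gene, "->", disease, "via", sorted(chems))
```

Keeping the supporting chemicals for each pair, rather than a bare yes/no, preserves the basis of each inference so that it can be inspected later.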
Inference scores are calculated for all inferred relationships. These scores reflect the degree of similarity between CTD chemical–gene–disease networks and a similar scale-free random network. The higher the score, the more likely the inference network has atypical connectivity. Many biological networks, such as disease and metabolic networks, have been shown to be scale-free random networks.[4] The inference score is calculated as the log-transformed product of two common-neighbor statistics used to assess the functional relationships between proteins in a protein–protein interaction network.[5] The first statistic takes into account the connectivity of the chemical and disease along with the number of genes used to make the inference. The second statistic takes into account the connectivity of each of the genes used to make the inference.
Chemical–Disease Associations
CTD contains curated and inferred chemical–disease associations. Curated chemical–disease associations are extracted from the published literature by CTD biocurators. Inferred associations (see figure) are established via CTD-curated chemical–gene interactions and gene–disease associations (e.g., chemical A is associated with disease B because chemical A has a curated interaction with gene C, and gene C has a curated association with disease B). Curated and inferred associations are identified as such, and both help users develop hypotheses about mechanisms underlying environmental diseases.
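As a rough illustration of how curated and inferred chemical–disease associations might be kept apart, the sketch below records, for each chemical–disease pair, whether it was curated directly and which genes support it by inference. The function name `chemical_disease_evidence`, the dictionary keys, and the toy records are assumptions made for this example and do not describe CTD's actual schema.

```python
def chemical_disease_evidence(curated_pairs, chem_gene, gene_disease):
    """Map each (chemical, disease) pair to its evidence.

    Each value is a dict with:
      "curated":         True if the pair is in curated_pairs
      "inference_genes": genes that link the chemical to the disease
    """
    # Index the curated gene-disease associations by gene.
    diseases_by_gene = {}
    for gene, disease in gene_disease:
        diseases_by_gene.setdefault(gene, set()).add(disease)

    # Start from the directly curated chemical-disease pairs.
    evidence = {
        pair: {"curated": True, "inference_genes": set()}
        for pair in curated_pairs
    }

    # Add inferred support: chemical -> gene -> disease.
    for chem, gene in chem_gene:
        for disease in diseases_by_gene.get(gene, ()):
            entry = evidence.setdefault(
                (chem, disease),
                {"curated": False, "inference_genes": set()},
            )
            entry["inference_genes"].add(gene)
    return evidence


# Toy, made-up records (not real CTD data).
curated_pairs = {("chemical A", "disease B")}
chem_gene = [
    ("chemical A", "gene C"),
    ("chemical A", "gene D"),
    ("chemical X", "gene D"),
]
gene_disease = [
    ("gene C", "disease B"),
    ("gene D", "disease B"),
    ("gene D", "disease Y"),
]

evidence = chemical_disease_evidence(curated_pairs, chem_gene, gene_disease)
```

In this toy data, the pair ("chemical A", "disease B") is both curated and supported by two inference genes, while the pairs involving "chemical X" are inferred only.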
Gene–Gene Interactions
CTD represents gene–gene interactions from BioGRID[6] that consist of genetic and protein interactions curated from primary literature for all major model organisms by BioGRID curators. These interactions are available for each gene and reference, and for the inference networks underlying each chemical–disease association. In addition, you can generate pathways for custom collections of genes using the Set Analyzer tool.
References
CTD contains reference articles related to toxicologically significant vertebrate and invertebrate genes, diseases, and associated chemicals. References were identified by information retrieval methods, and comprise a subset of MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.
Organisms
CTD's hierarchical organism vocabulary consists of the Eumetazoa (vertebrates and invertebrates) branch of the Taxonomy Database from the National Center for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine. You can browse organisms, or use them to formulate gene, interaction, or reference queries.
Gene Ontology
Gene Ontology (GO) annotations are integrated with gene data in CTD. In addition, GO terms that are statistically enriched among genes/proteins that interact with a chemical are displayed for each chemical. You can browse GO and use it to formulate gene and interaction queries.
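The text does not say which statistical test CTD uses to decide that a term is enriched. One common choice for this kind of question is a one-sided hypergeometric test, sketched below purely as an assumption. The function name `hypergeom_enrichment_p` and all of the numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: total number of genes considered
    annotated:  genes in the population that carry the term
    sample:     genes that interact with the chemical
    overlap:    genes in the sample that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of seeing `overlap` or more annotated genes
    # when drawing `sample` genes without replacement.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up numbers: 10 genes in total, 5 carry the term, and the chemical
# interacts with 5 genes, all of which carry the term.
p_all_annotated = hypergeom_enrichment_p(10, 5, 5, 5)
# With no overlap required, the tail covers every outcome.
p_no_requirement = hypergeom_enrichment_p(10, 5, 5, 0)
```

A small p-value suggests that the term appears among the chemical's interacting genes more often than chance would predict. In practice, testing many terms at once also calls for a multiple-testing correction, which this sketch omits.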
Pathways
KEGG and REACTOME pathway data describe known molecular interaction and reaction networks. These data are integrated with chemicals, genes, and diseases in CTD to provide insights into molecular networks that may be affected by chemicals, and possible mechanisms underlying environmental diseases. You can browse pathways, or use them to formulate gene or chemical–gene interaction queries. Pathway information is provided for chemical, gene, and disease detail pages. Pathways that are statistically enriched among genes/proteins that interact with a chemical are displayed for each chemical.
Exposures
CTD is working to enhance the capacity to identify environment–disease connections by developing an Exposure Ontology (ExO) that will be used to curate and present exposure data.