DisGeNET is a discovery platform containing one of the largest publicly available collections of genes and variants associated with human diseases (Piñero et al., 2016; Piñero et al., 2015). DisGeNET integrates data from expert-curated repositories, GWAS catalogues, animal models, and the scientific literature. DisGeNET data are homogeneously annotated with controlled vocabularies and community-driven ontologies. Additionally, several original metrics are provided to assist the prioritization of genotype–phenotype relationships.
The current version of DisGeNET (v6.0) contains 628,685 gene-disease associations (GDAs), between 17,549 genes and 24,166 diseases, disorders, traits, and clinical or abnormal human phenotypes, and 210,498 variant-disease associations (VDAs), between 117,337 variants and 10,358 diseases, traits, and phenotypes.
The information in DisGeNET can be accessed in several ways:
- The web interface, through the Search and Browse functionalities
- The Resource Description Framework representation (DisGeNET-RDF), via the SPARQL endpoint and the Faceted Browser
- The DisGeNET Cytoscape App
- Scripts in the most commonly used programming languages
- The disgenet2r package (note to users: the disgenet2r package still uses the data from version 5.0 of DisGeNET)
- The SQLite database
- Tab-separated files (see the Downloads section)
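As an illustration of the tab-separated access route listed above, the following is a minimal Python sketch that reads gene-disease associations from a TSV file and lists the genes linked to one disease. The column names (`geneSymbol`, `diseaseId`, `score`) are assumptions about the file layout and should be checked against the downloaded file; the sample rows are made-up placeholders, not DisGeNET data, and the helper names are hypothetical.

```python
import csv
import io

# Placeholder rows for illustration only, NOT real DisGeNET data.
# The column names are assumptions about the TSV layout.
SAMPLE_TSV = (
    "geneSymbol\tdiseaseId\tscore\n"
    "GENE_A\tC0000001\t0.80\n"
    "GENE_B\tC0000001\t0.25\n"
    "GENE_A\tC0000002\t0.40\n"
)


def load_gdas(handle, min_score=0.0):
    """Return GDA rows whose score is at least min_score."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        score = float(row["score"])
        if score >= min_score:
            rows.append(
                {"gene": row["geneSymbol"], "disease": row["diseaseId"], "score": score}
            )
    return rows


def genes_for_disease(rows, disease_id):
    """Return genes linked to disease_id, highest score first."""
    hits = [r for r in rows if r["disease"] == disease_id]
    ranked = sorted(hits, key=lambda r: r["score"], reverse=True)
    return [r["gene"] for r in ranked]


if __name__ == "__main__":
    gdas = load_gdas(io.StringIO(SAMPLE_TSV), min_score=0.3)
    print(genes_for_disease(gdas, "C0000001"))
```

To use a real download, the `io.StringIO(SAMPLE_TSV)` handle could be replaced by a file opened with `open(path, newline="", encoding="utf-8")`.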
DisGeNET is a versatile platform that can be used for different research purposes, including the investigation of the molecular underpinnings of human diseases and their comorbidities, the analysis of the properties of disease genes, the generation of hypotheses on drug therapeutic action and drug adverse effects, the validation of computationally predicted disease genes, and the evaluation of the performance of text-mining methods.
The DisGeNET database is made available under the Attribution-NonCommercial-ShareAlike 4.0 International License. For more details, visit the Legal Notice page.
- 628,685 GDAs between 17,549 genes and 24,166 diseases and traits
- 210,498 VDAs between 117,337 variants and 10,358 diseases and traits
- Improved web interface: new search and filter options
- New data sources: CGI, ClinGen, Genomics England PanelApp, and GWASdb
- New: inferred GDAs from HPO, GWAS Catalog, and GWASdb
- New GDA attribute: the Evidence Level
- New gene attribute: the pLI (probability of being loss-of-function intolerant)
- New variant attributes: the allelic frequency in gnomAD exomes and genomes
- Improved text-mining system (BeFree)
ELIXIR has announced its first portfolio of Recommended Interoperability Resources (RIRs) to facilitate interoperability and reusability of life science data and support the principles of FAIR data management.
See the list of ELIXIR Recommended Interoperability Resources.
- 561,119 gene-disease associations (GDAs), between 17,074 genes and 20,370 diseases, disorders, traits, and clinical or abnormal human phenotypes
- 135,588 variant-disease associations (VDAs), between 83,002 SNPs and 9,169 diseases and phenotypes
- New data sources: PsyGeNET and the Human Phenotype Ontology
- New: Two DisGeNET Scores are now available: one for GDAs, and one for VDAs
- New: A Disease Specificity Index (DSI) and a Disease Pleiotropy Index (DPI) have been computed for the variants
- New: We have added the Evidence Index for GDAs and VDAs.
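The Disease Specificity Index and Disease Pleiotropy Index mentioned in the list above can be sketched in Python as follows. The sketch assumes the definitions given in the DisGeNET documentation: DSI = log2(N_d / N_T) / log2(1 / N_T), where N_d is the number of diseases associated with a gene or variant and N_T is the total number of diseases, and DPI = (N_dc / N_TC) * 100, where N_dc is the number of distinct disease classes among those diseases and N_TC is the total number of disease classes. The function names and the counts used in the example are hypothetical.

```python
import math


def disease_specificity_index(n_diseases, n_total_diseases):
    """DSI under the assumed definition: log2(N_d / N_T) / log2(1 / N_T).

    Equals 1.0 for an entity linked to a single disease and 0.0 for an
    entity linked to every disease in the collection.
    """
    if n_total_diseases < 2:
        raise ValueError("n_total_diseases must be at least 2")
    if not 0 < n_diseases <= n_total_diseases:
        raise ValueError("n_diseases must be in 1..n_total_diseases")
    return math.log2(n_diseases / n_total_diseases) / math.log2(1 / n_total_diseases)


def disease_pleiotropy_index(n_classes, n_total_classes):
    """DPI under the assumed definition: (N_dc / N_TC) * 100."""
    if n_total_classes < 1:
        raise ValueError("n_total_classes must be positive")
    if not 0 <= n_classes <= n_total_classes:
        raise ValueError("n_classes must be in 0..n_total_classes")
    return n_classes / n_total_classes * 100


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    print(disease_specificity_index(10, 1000))
    print(disease_pleiotropy_index(5, 20))
```

Under these definitions a high DSI indicates an entity associated with few diseases, while a high DPI indicates associations spread across many disease classes.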