The DisGeNET database is made available under the Open Database License. For more information, see the Legal Notices page.
Data in tab-separated files
The file contains gene-disease associations from UniProt, CTD (human subset), ClinVar, Orphanet, and the GWAS Catalog.
The file contains gene-disease associations obtained by text mining MEDLINE abstracts using the BeFree system.
The file contains all gene-disease associations in DisGeNET.
The file contains SNP-gene-disease associations obtained by text mining MEDLINE abstracts using the BeFree system.
The file contains all SNP-gene-disease associations in DisGeNET.
The file contains the publications supporting the gene-disease associations obtained by text mining MEDLINE abstracts (2015-2016) using the BeFree system.
RDF Linked Dataset
The directory contains the DisGeNET-RDF data dump and the VoID description files corresponding to DisGeNET version 4.0.
Nanopublications Linked Dataset
The DisGeNET Nanopublications dataset v188.8.131.52 is the nanopublication distribution of DisGeNET v4.0 and is provided as a single file. The dump is serialized in RDF/TriG format.
Querying DisGeNET for several diseases or genes at once can take a long time, especially for lists longer than 20 entities. For this reason, we provide the following scripts, which query the database faster.
All scripts take four parameters:
- inputfile: name of the file with the genes or diseases to query. The file should contain a list of genes or diseases, one per line.
- resultsfile: name of the file to write the output to.
- type of entity: gene or disease.
- type of identifier:
  - For Genes: entrez or hgnc
  - For Diseases: cui, mesh or omim
It is important to place the input file and the script in the same folder.
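As an illustration of the four parameters described above, the following is a minimal Python 3 sketch of how they could be parsed and checked with argparse. It is not the code of the DisGeNET scripts: the function names `build_parser` and `parse_args`, the argument names, and the rule that rejects mismatched entity/identifier pairs are assumptions made for this example.

```python
import argparse

# Identifier types allowed for each entity type, as listed above.
VALID_IDENTIFIERS = {
    "gene": {"entrez", "hgnc"},
    "disease": {"cui", "mesh", "omim"},
}


def build_parser():
    """Build a parser for the four positional parameters (hypothetical names)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the command line shared by the query scripts."
    )
    parser.add_argument("inputfile", help="file with one gene or disease per line")
    parser.add_argument("resultsfile", help="file to write the output to")
    parser.add_argument("entity", choices=sorted(VALID_IDENTIFIERS))
    parser.add_argument(
        "identifier",
        choices=sorted(set().union(*VALID_IDENTIFIERS.values())),
    )
    return parser


def parse_args(argv):
    """Parse argv and reject identifier types that do not fit the entity type."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.identifier not in VALID_IDENTIFIERS[args.entity]:
        # parser.error prints a usage message and exits with status 2.
        parser.error(
            f"identifier '{args.identifier}' is not valid for entity '{args.entity}'"
        )
    return args


# Same argument order as the usage examples in this section.
args = parse_args(["hgncListToQuery.txt", "resultsFile.txt", "gene", "hgnc"])
print(args.inputfile, args.resultsfile, args.entity, args.identifier)
```

Checking the entity/identifier combination up front means a mistake such as `gene mesh` is reported before any query is sent.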
The script uses the argparse and urllib2 libraries. You can download our example files by clicking on them. Example of usage:
- For Genes: python disgenet_python2.py hgncListToQuery.txt resultsFile.txt gene hgnc
- For Diseases: python disgenet_python2.py meshListToQuery.txt resultsFile.txt disease mesh
The script uses the argparse and urllib libraries. Example of usage:
- For Genes: python disgenet_python3.py geneidListToQuery.txt resultsFile.txt gene entrez
- For Diseases: python disgenet_python3.py omimListToQuery.txt resultsFile.txt disease omim
The script uses the LWP package. Example of usage:
- For Genes: perl disgenet.pl geneidListToQuery.txt resultsFile.txt genes entrez
- For Diseases: perl disgenet.pl omimListToQuery.txt resultsFile.txt disease omim
The script uses the RCurl library. For some R versions there might be issues with the "rawToChar" function. If the script produces the error "Error in rawToChar(getURLContent(url, readfunction = charToRaw(oql), upload = TRUE...", uncomment line 84 and comment out line 85. Example of usage:
- For Genes: R --vanilla --quiet --silent --args hgncListToQuery.txt resultsFile.txt gene hgnc < disgenet.R
- For Diseases: R --vanilla --quiet --silent --args cuiListToQuery.txt resultsFile.txt disease cui < disgenet.R
The file contains the mappings of DisGeNET genes (Entrez Gene Identifiers) to UniProt entries.
The file contains the mappings of DisGeNET UMLS CUIs to MeSH identifiers. Note that not every CUI in DisGeNET has a MeSH identifier (less than 60%). Also, the correspondence is not always 1:1. The mappings were generated with the UMLS Metathesaurus (version 2015AB).
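Because some CUIs have no MeSH identifier and the correspondence is not always 1:1, code that consumes the mapping should allow zero, one, or several MeSH identifiers per CUI. The Python 3 sketch below shows one way to do this. The column names `diseaseId` and `meshId` and the sample rows are placeholders invented for this example, not the actual header or content of the mapping file.

```python
import csv
import io
from collections import defaultdict

# Placeholder data in tab-separated form; identifiers are made up for illustration.
SAMPLE = (
    "diseaseId\tmeshId\n"
    "C0000001\tD000001\n"
    "C0000001\tD000002\n"
    "C0000002\tD000003\n"
)


def load_cui_to_mesh(handle):
    """Read tab-separated rows and group MeSH identifiers by CUI."""
    mapping = defaultdict(set)
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # Column names are assumptions; adjust them to the real file header.
        mapping[row["diseaseId"]].add(row["meshId"])
    return mapping


cui_to_mesh = load_cui_to_mesh(io.StringIO(SAMPLE))

# Use .get so that looking up an unmapped CUI does not add an empty entry.
for cui in ["C0000001", "C0000002", "C0000003"]:
    print(cui, sorted(cui_to_mesh.get(cui, ())))
```

To read a real file, an open file handle (for example from `open(path, newline="")`) can be passed in place of the `io.StringIO` object.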