Important notice

The DisGeNET database is made available under the Open Database License. For more information, see the Legal Notices page.

Data in tab-separated files

1. Curated gene-disease associations

The file contains gene-disease associations from UniProt, CTD (human subset), ClinVar, Orphanet, and the GWAS Catalog.

2. BeFree gene-disease associations

The file contains gene-disease associations obtained by text mining MEDLINE abstracts using the BeFree system.

3. ALL gene-disease associations

The file contains all gene-disease associations in DisGeNET.

4. BeFree SNP-gene-disease associations

The file contains SNP-gene-disease associations obtained by text mining MEDLINE abstracts using the BeFree system.

5. All SNP-gene-disease associations

The file contains all SNP-gene-disease associations in DisGeNET.

6. Publications with gene-disease associations from the BeFree system

The file contains the publications supporting the gene-disease associations obtained by text mining MEDLINE abstracts (2015-2016) using the BeFree system.

7. README file
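
As a rough illustration of how one of the tab-separated files above could be read, the following Python sketch groups diseases by gene. It assumes the file starts with a header row; the column names geneId and diseaseId and the sample rows are placeholders chosen for illustration, so check the README file for the actual column layout.

```python
import csv
import io
from collections import defaultdict


def diseases_by_gene(handle, gene_column="geneId", disease_column="diseaseId"):
    """Group disease identifiers by gene from a tab-separated stream.

    The column names are assumptions for this sketch, not taken from
    the DisGeNET README.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    grouped = defaultdict(set)
    for row in reader:
        grouped[row[gene_column]].add(row[disease_column])
    return dict(grouped)


# Made-up placeholder rows standing in for a downloaded file.
sample = io.StringIO(
    "geneId\tdiseaseId\n"
    "1\tC0000001\n"
    "1\tC0000002\n"
    "2\tC0000001\n"
)

for gene, diseases in sorted(diseases_by_gene(sample).items()):
    print(gene, sorted(diseases))
```

For a real download, the in-memory sample would be replaced by an opened file, for example open(path, newline="", encoding="utf-8").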

RDF Linked Dataset

1. RDF Downloads

The directory contains the DisGeNET-RDF data dump and the VoID description files corresponding to DisGeNET version 4.0.

Nanopublications Linked Dataset

1. DisGeNET Nanopublications dataset v4.0.0.0

The DisGeNET Nanopublications dataset v4.0.0.0 is the distribution of DisGeNET v4.0 and is provided as a single file. The dump is serialized in RDF/TriG format.

Scripts

Querying DisGeNET for several diseases or genes at once can take a long time, particularly for lists longer than 20 entities. For this reason, we provide the following scripts, which query the database more quickly.

All scripts use 4 parameters:

  1. inputfile: name of the file with the genes or diseases to query. The file should contain a list of genes or diseases, one per line.
  2. resultsfile: name of the file to write the output to
  3. type of entity: gene or disease
  4. type of identifier:
    • For Genes: entrez or hgnc
    • For Diseases: cui, mesh or omim

It is important to place the input file and the script in the same folder.
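
The four parameters and their allowed combinations can be sketched in Python as follows. This is not the code of the scripts offered here, only an illustration of the argument rules described above; the function names parse_query_args and read_entities are hypothetical.

```python
import argparse

# Identifier types allowed for each entity type, as listed above.
VALID_IDENTIFIERS = {
    "gene": {"entrez", "hgnc"},
    "disease": {"cui", "mesh", "omim"},
}


def parse_query_args(argv):
    """Parse the four positional parameters and check that they agree."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument check for a DisGeNET list query."
    )
    parser.add_argument("inputfile", help="file with one gene or disease per line")
    parser.add_argument("resultsfile", help="file to write the output to")
    parser.add_argument("entity", choices=sorted(VALID_IDENTIFIERS))
    parser.add_argument("identifier", help="entrez, hgnc, cui, mesh or omim")
    args = parser.parse_args(argv)
    if args.identifier not in VALID_IDENTIFIERS[args.entity]:
        allowed = ", ".join(sorted(VALID_IDENTIFIERS[args.entity]))
        parser.error(f"identifier for {args.entity} must be one of: {allowed}")
    return args


def read_entities(lines):
    """Return the non-empty entries, one per input line, without whitespace."""
    return [line.strip() for line in lines if line.strip()]


args = parse_query_args(["hgncListToQuery.txt", "resultsFile.txt", "gene", "hgnc"])
print(args.entity, args.identifier)
```

A mismatched pair such as gene with cui makes argparse print a usage message and exit. read_entities accepts any iterable of lines, including an opened input file.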

1. In Python 2

The script uses the argparse and urllib2 libraries. You can download our example files by clicking on them.

Example of usage:

2. In Python 3

The script uses the argparse and urllib libraries.

Example of usage:

3. In Perl

The script uses the LWP package.

Example of usage:

4. In R

The script uses the RCurl library. For some R versions there might be issues with the "rawToChar" function. If the script produces the error "Error in rawToChar(getURLContent(url, readfunction = charToRaw(oql), upload = TRUE...", uncomment line 84 and comment out line 85.

Example of usage:

  • For Genes:
  • R --vanilla --quiet --silent --args hgncListToQuery.txt resultsFile.txt gene hgnc < disgenet.R
  • For Diseases:
  • R --vanilla --quiet --silent --args cuiListToQuery.txt resultsFile.txt disease cui < disgenet.R

Mappings

1. UniProt Downloads

The file contains the mappings of DisGeNET genes (Entrez Gene Identifiers) to UniProt entries.

2. UMLS CUI to MeSH Identifier

The file contains the mappings of DisGeNET UMLS CUIs to MeSH identifiers. Note that not every CUI in DisGeNET has a MeSH identifier (fewer than 60% do), and the correspondence is not always 1:1. The mappings were generated with the UMLS Metathesaurus (v2015AB).
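
Because a CUI may map to several MeSH identifiers or to none, code that uses the mapping should store a set of identifiers per CUI and handle missing entries. The Python sketch below assumes a two-column, tab-separated layout without a header (CUI, then MeSH identifier); this layout, the function names and the sample identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def load_cui_to_mesh(lines, delimiter="\t"):
    """Build a CUI -> set of MeSH identifiers mapping from text lines."""
    mapping = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        cui, mesh = fields[0], fields[1]
        mapping[cui].add(mesh)
    return dict(mapping)


def mapped_fraction(cuis, mapping):
    """Fraction of the given CUIs that have at least one MeSH identifier."""
    cuis = list(cuis)
    if not cuis:
        return 0.0
    return sum(1 for cui in cuis if cui in mapping) / len(cuis)


# Made-up placeholder lines standing in for the mapping file.
sample_lines = [
    "C0000001\tD000001\n",
    "C0000001\tD000002\n",  # one CUI, two MeSH identifiers
    "C0000002\tD000003\n",
]
mapping = load_cui_to_mesh(sample_lines)

# A CUI without a mapping yields an empty result instead of an error.
print(sorted(mapping.get("C0000003", ())))
print(mapped_fraction(["C0000001", "C0000003"], mapping))
```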