The disgenet2r package

The disgenet2r package contains a set of functions to retrieve, visualize, and expand DisGeNET data. These functions allow filtering the information using several DisGeNET metrics, such as the score range and the source database of the data. The package also offers several types of plots (heatmaps, Venn diagrams, and networks) to visualize the data.

Installation

The disgenet2r package is available through Bitbucket. It requires an R version > 3.5 and the following packages: VennDiagram, stringr, tidyr, SPARQL, RCurl, igraph, ggplot2, and reshape2.
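Before installing, it can help to check these prerequisites. The following is a minimal sketch in base R, not part of disgenet2r; the variable names are illustrative, and it assumes that "R version > 3.5" can be tested by comparing against the version string "3.5".

```r
# Packages listed as requirements above
required <- c("VennDiagram", "stringr", "tidyr", "SPARQL", "RCurl",
              "igraph", "ggplot2", "reshape2")

# TRUE when the running R version is greater than 3.5 (assumed interpretation)
r_ok <- getRversion() > "3.5"

# requireNamespace() returns FALSE instead of failing when a package is absent
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (!r_ok) {
  warning("disgenet2r requires an R version > 3.5")
}
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # One way to install them, assuming they are available from your repositories:
  # install.packages(missing_pkgs)
}
```

If nothing is reported, all listed dependencies are already installed.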

Install disgenet2r by typing in R:

library(devtools)

install_bitbucket("ibi_group/disgenet2r")

To load the package:

library(disgenet2r)

Retrieving GDAs

To retrieve the diseases associated with a list of genes, use the gene2disease function:

results <- gene2disease( gene = c( "KCNE1", "KCNE2", "KCNH1", "KCNH2", "KCNG1"), verbose = TRUE)

To retrieve the genes associated with a list of diseases, use the disease2gene function:

results <- disease2gene( disease = c("C0036341", "C0002395", "C0030567", "C0005586"), database = "CURATED", verbose = TRUE )

Retrieving VDAs

To retrieve the diseases associated with a list of variants, use the variant2disease function:

results <- variant2disease( variant = "rs121913279", database = "CURATED" )

To retrieve the variants associated with a list of diseases, use the disease2variant function:

results <- disease2variant( disease = c("C3150943", "C1859062", "C2678485", "C4015695"), database = "CURATED", score = c(0.75, 1) )
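The disease and variant identifiers used in these queries have the form of UMLS concept identifiers (C followed by seven digits) and dbSNP rsIDs. As a rough sketch, the inputs can be checked for obvious typos before querying. The helpers is_cui and is_rsid are hypothetical and not part of disgenet2r, and the patterns are assumptions based on the identifiers shown above.

```r
# Hypothetical helpers (not part of disgenet2r): basic format checks
is_cui  <- function(x) grepl("^C[0-9]{7}$", x)  # e.g. "C0036341"
is_rsid <- function(x) grepl("^rs[0-9]+$", x)   # e.g. "rs121913279"

diseases <- c("C3150943", "C1859062", "C2678485", "C4015695")
variants <- "rs121913279"

# Collect any identifiers that do not match the expected pattern
bad_diseases <- diseases[!is_cui(diseases)]
bad_variants <- variants[!is_rsid(variants)]

if (length(bad_diseases) > 0) {
  warning("Unexpected disease identifiers: ", paste(bad_diseases, collapse = ", "))
}
if (length(bad_variants) > 0) {
  warning("Unexpected variant identifiers: ", paste(bad_variants, collapse = ", "))
}
```

A format check only catches malformed strings; it does not confirm that an identifier exists in DisGeNET.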

Performing a Disease Enrichment

The disease_enrichment function receives a list of genes and performs an enrichment analysis over the diseases in DisGeNET.

The genes in the input list should be identified by HGNC symbols or Entrez Gene identifiers. Specify which one is used with the vocabulary parameter. By default, vocabulary = "HGNC".

The function has two other optional arguments: the source database (by default, database = "CURATED") and a list of genes to be used as the universe for the Fisher test. If no universe is supplied, the function uses the genes contained in the specified database (in the example, all genes in DisGeNET curated).

To perform the enrichment, run:

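# Example gene list (HGNC symbols), reusing the genes from the GDA example above
list_of_genes <- c( "KCNE1", "KCNE2", "KCNH1", "KCNH2", "KCNG1" )

res_enrich <- disease_enrichment( genes = list_of_genes, vocabulary = "HGNC", database = "CURATED" )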
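To illustrate the role of the universe in the Fisher test, here is a minimal base R sketch of a one-sided test for a single disease. It is not the package's implementation: fisher_enrichment is a hypothetical helper, the gene names are made-up placeholders, and the choice of a one-sided ("greater") test is an assumption.

```r
# Hypothetical helper (not part of disgenet2r): over-representation test
# for one disease, given a query gene list and a background universe.
fisher_enrichment <- function(query, disease_genes, universe) {
  universe      <- unique(universe)
  query         <- intersect(unique(query), universe)
  disease_genes <- intersect(unique(disease_genes), universe)

  in_both      <- length(intersect(query, disease_genes))
  query_only   <- length(query) - in_both
  disease_only <- length(disease_genes) - in_both
  neither      <- length(universe) - in_both - query_only - disease_only

  # 2 x 2 contingency table: rows = in disease, columns = in query
  tab <- matrix(c(in_both, query_only, disease_only, neither), nrow = 2,
                dimnames = list(disease = c("yes", "no"),
                                query   = c("yes", "no")))
  fisher.test(tab, alternative = "greater")$p.value
}

# Placeholder data for illustration only
universe      <- paste0("G", 1:100)
disease_genes <- paste0("G", 1:10)

p_overlap    <- fisher_enrichment(paste0("G", 1:5),   disease_genes, universe)
p_no_overlap <- fisher_enrichment(paste0("G", 50:54), disease_genes, universe)
```

A smaller universe makes the same overlap look less surprising, which is why the choice of background genes affects the resulting p-values.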

See the full vignette here