COVID-19 DisGeNET data collection

This website shows the results of applying state-of-the-art text mining tools developed by MedBioinformatics Solutions to the LitCovid dataset (Chen, Allot, and Lu, 2020) to identify mentions of diseases, signs, and symptoms. The LitCovid dataset contains a selection of papers on coronavirus disease 2019 (COVID-19).

Our text mining tools scan those articles and identify any mention of diseases and phenotypes, together with mentions of the COVID-19 virus. These mentions are normalized to standard vocabularies.
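To make the idea concrete, here is a minimal sketch of dictionary-based mention detection, normalization, and co-mention counting. It is not the MedBioinformatics Solutions tooling, whose internals are not described here. The lexicon, the placeholder concept IDs (EX:0001 and so on), the function names, and the sample abstracts are all invented for illustration.

```python
import re
from collections import Counter

# Toy lexicon: surface form -> (placeholder concept ID, preferred name).
# The IDs are invented for this example; they are not real vocabulary codes.
LEXICON = {
    "fever": ("EX:0001", "Fever"),
    "pyrexia": ("EX:0001", "Fever"),  # synonym normalized to the same concept
    "cough": ("EX:0002", "Cough"),
    "pneumonia": ("EX:0003", "Pneumonia"),
}

# Assumed surface forms used to detect a mention of the disease or virus.
VIRUS_TERMS = ("covid-19", "sars-cov-2")


def find_concepts(text):
    """Return the set of normalized concept IDs mentioned in the text."""
    lowered = text.lower()
    found = set()
    for term, (concept_id, _name) in LEXICON.items():
        # Whole-word match so that "cough" does not match inside other words.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add(concept_id)
    return found


def mentions_virus(text):
    """Return True if the text contains any of the assumed virus terms."""
    lowered = text.lower()
    return any(term in lowered for term in VIRUS_TERMS)


def co_mention_counts(abstracts):
    """Count, per concept, the abstracts that mention it alongside the virus."""
    counts = Counter()
    for text in abstracts:
        if mentions_virus(text):
            counts.update(find_concepts(text))
    return counts


# Made-up sample abstracts, for illustration only.
abstracts = [
    "Patients with COVID-19 presented with fever and cough.",
    "Pyrexia was common in SARS-CoV-2 infection.",
    "Pneumonia outcomes in influenza patients.",
]
counts = co_mention_counts(abstracts)
```

In this sketch, "fever" and "pyrexia" map to the same placeholder concept, which is the role normalization plays. The third abstract is skipped because it does not mention the virus. A production pipeline would typically rely on trained entity recognizers and real vocabularies rather than exact string matching.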

The data is available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, whose text can be found here.

For more information, please contact us at support(at)disgenet(dot)org

The image below shows the network of phenotypes most frequently mentioned together with COVID-19 and with SARS-CoV-2.

Data History

  • Version 1.0 released (April 19, 2020)
    • The dataset contains 905 diseases and phenotypes across 4,833 publications.