COVID-19 Knowledge Extractor (COKE): A Tool and a Web Portal to Extract Drug - Target Protein Associations from the CORD-19 Corpus of Scientific Publications on COVID-19

Bibliographic Details
Published in: ChemRxiv
Main Authors: Korn, Daniel; Pervitsky, Vera; Bobrowski, Tesia; Alves, Vinicius M; Schmitt, Charles; Bizon, Chris; Baker, Nancy; Chirkova, Rada; Cherkasov, Artem; Muratov, Eugene; Tropsha, Alexander
Format: Journal Article, Paper
Language: English
Published: United States, 26.11.2020
Edition: 1
Summary: The COVID-19 pandemic has catalyzed a widespread effort to identify drug candidates and biological targets relevant to SARS-CoV-2 infection, resulting in a large number of publications on this subject. We have built the COVID-19 Knowledge Extractor (COKE), a web application to extract, curate, and annotate essential drug-target relationships from the research literature on COVID-19 to assist drug repurposing efforts. SciBiteAI ontological tagging of the COVID Open Research Dataset (CORD-19), a repository of COVID-19 scientific publications, was employed to identify drug-target relationships. Entity identifiers were resolved through lookup routines using UniProt and DrugBank. A custom algorithm was used to identify co-occurrences of protein and drug terms, and a confidence score was calculated for each entity pair. COKE processing of the current CORD-19 database identified about 3,000 drug-protein pairs, including 29 unique proteins and 500 investigational, experimental, and approved drugs. Some of these drugs are presently undergoing clinical trials for COVID-19. The rapidly evolving COVID-19 pandemic has produced a dramatic growth of publications on this subject in a short period. These circumstances call for methods that can condense the literature into the key concepts and relationships necessary for insights into SARS-CoV-2 drug repurposing. The COKE repository and web application deliver key drug-target protein relationships to researchers studying SARS-CoV-2. The COKE portal may provide comprehensive and critical information on studies concerning drug repurposing against COVID-19. COKE is freely available at https://coke.mml.unc.edu/ and the code is available at https://github.com/DnlRKorn/CoKE.
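The summary does not specify how co-occurrences are counted or how confidence scores are computed. As a rough illustration only, the sketch below counts document-level co-occurrences of already-tagged drug and protein identifiers and scores each pair with a Jaccard index. The function name `cooccurrence_scores`, the input layout, the placeholder identifiers, and the choice of Jaccard scoring are all assumptions made for this example; they are not taken from COKE.

```python
from collections import defaultdict
from itertools import product


def cooccurrence_scores(tagged_docs):
    """Score drug-protein pairs by document-level co-occurrence.

    tagged_docs: iterable of (drugs, proteins), one entry per document,
    each a collection of entity identifiers (hypothetical input layout).
    Returns {(drug, protein): Jaccard score in (0, 1]} for pairs that
    share at least one document.
    """
    drug_docs = defaultdict(set)     # drug id -> ids of documents mentioning it
    protein_docs = defaultdict(set)  # protein id -> ids of documents mentioning it
    for doc_id, (drugs, proteins) in enumerate(tagged_docs):
        for drug in set(drugs):
            drug_docs[drug].add(doc_id)
        for protein in set(proteins):
            protein_docs[protein].add(doc_id)

    scores = {}
    for drug, protein in product(drug_docs, protein_docs):
        shared = drug_docs[drug] & protein_docs[protein]
        if shared:
            union = drug_docs[drug] | protein_docs[protein]
            # Assumed scoring: Jaccard index, not the authors' confidence score.
            scores[(drug, protein)] = len(shared) / len(union)
    return scores


# Placeholder identifiers; a real pipeline would use resolved DrugBank/UniProt IDs.
docs = [
    ({"DRUG_A"}, {"PROT_X"}),
    ({"DRUG_A", "DRUG_B"}, {"PROT_X"}),
    ({"DRUG_B"}, {"PROT_Y"}),
]
scores = cooccurrence_scores(docs)
```

Here `DRUG_A` and `PROT_X` appear in exactly the same documents and so receive the highest score, while `DRUG_A` and `PROT_Y` never co-occur and are omitted from the result.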
Bibliography: The authors declare no conflicts of interest.
ISSN: 2573-2293
DOI: 10.26434/chemrxiv.13289222.v1