Discovering and Summarizing Relationships Between Chemicals, Genes, Proteins, and Diseases in PubChem

Bibliographic Details
Published in Frontiers in Research Metrics and Analytics, Vol. 6; p. 689059
Main Authors Zaslavsky, Leonid; Cheng, Tiejun; Gindulyte, Asta; He, Siqian; Kim, Sunghwan; Li, Qingliang; Thiessen, Paul; Yu, Bo; Bolton, Evan E.
Format Journal Article
Language English
Published Frontiers Media S.A. 12.07.2021
Summary: The literature knowledge panels developed and implemented in PubChem are described. These help to uncover and summarize important relationships between chemicals, genes, proteins, and diseases by analyzing co-occurrences of terms in biomedical literature abstracts. Named entities in PubMed records are matched with chemical names in PubChem, disease names in Medical Subject Headings (MeSH), and gene/protein names in popular gene/protein information resources, and the most closely related entities are identified using statistical analysis and relevance-based sampling. Knowledge panels for the co-occurrence of chemical, disease, and gene/protein entities are included in PubChem Compound, Protein, and Gene pages, summarizing these in a compact form. Statistical methods for removing redundancy and estimating relevance scores are discussed, along with benefits and pitfalls of relying on automated (i.e., not human-curated) methods operating on data from multiple heterogeneous sources.
Edited by: Karin Verspoor, RMIT University, Australia
Reviewed by: Bridget McInnes, Virginia Commonwealth University, United States; Nansu Zong, Mayo Clinic, United States
This article was submitted to Text-mining and Literature-based Discovery, a section of the journal Frontiers in Research Metrics and Analytics
ORCID
orcid.org/0000-0002-1992-2086
orcid.org/0000-0002-4486-3356
orcid.org/0000-0002-5959-6190
orcid.org/0000-0001-5873-4873
orcid.org/0000-0002-6453-236X
orcid.org/0000-0003-3952-8921
orcid.org/0000-0001-9600-5305
orcid.org/0000-0002-1707-4167
orcid.org/0000-0001-9828-2074
ISSN: 2504-0537
DOI: 10.3389/frma.2021.689059