Integrating Semantic Information into Multiple Kernels for Protein-Protein Interaction Extraction from Biomedical Literatures


Bibliographic Details
Published in PLoS ONE Vol. 9; no. 3; p. e91898
Main Authors Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen
Format Journal Article
Language English
Published United States, Public Library of Science (PLoS), 12.03.2014

Summary: Protein-Protein Interaction (PPI) extraction is an important task in biomedical information extraction. Many machine learning methods for PPI extraction have achieved promising results, but their performance is still not satisfactory. One reason is that semantic resources have largely been ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs that combines a feature-based kernel, a tree kernel and a semantic kernel. In particular, we extend the shortest path-enclosed tree kernel (SPT) with a dynamic extension strategy to capture richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Headings (MeSH). We evaluate our method with a Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, showing that, by integrating semantic information, our method outperforms most state-of-the-art systems.
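The summary describes combining several kernels into one before training an SVM. The sketch below illustrates that general idea only: a non-negative weighted sum of normalized kernels, which is itself a valid kernel. Everything in it is an assumption made for illustration. The two toy kernels (`feature_kernel`, `context_kernel`), the instance layout and the equal weights are hypothetical stand-ins; this record does not give the paper's actual tree kernel, its WordNet/MeSH similarity measures or its weighting scheme.

```python
import math
from typing import Callable, Dict, List, Sequence

Instance = Dict[str, object]
Kernel = Callable[[Instance, Instance], float]


def feature_kernel(a: Instance, b: Instance) -> float:
    """Linear kernel over fixed-length feature vectors (toy stand-in)."""
    return float(sum(x * y for x, y in zip(a["features"], b["features"])))


def context_kernel(a: Instance, b: Instance) -> float:
    """Set-intersection kernel over context words.

    Hypothetical placeholder for a semantic kernel; it does not use
    WordNet or MeSH.
    """
    return float(len(a["context"] & b["context"]))


def normalized(kernel: Kernel) -> Kernel:
    """Cosine-normalize a kernel so that k(x, x) == 1 for non-degenerate x."""
    def k(a: Instance, b: Instance) -> float:
        denom = math.sqrt(kernel(a, a) * kernel(b, b))
        return kernel(a, b) / denom if denom > 0 else 0.0
    return k


def combine(kernels: Sequence[Kernel], weights: Sequence[float]) -> Kernel:
    """Weighted sum of kernels; non-negative weights keep the result a kernel."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    def k(a: Instance, b: Instance) -> float:
        return sum(w * kern(a, b) for kern, w in zip(kernels, weights))
    return k


def gram_matrix(kernel: Kernel, items: Sequence[Instance]) -> List[List[float]]:
    """Pairwise kernel values for all items."""
    return [[kernel(a, b) for b in items] for a in items]


# Two made-up candidate protein pairs (illustrative data only).
items = [
    {"features": [1.0, 0.0], "context": {"binds", "protein"}},
    {"features": [1.0, 1.0], "context": {"binds", "receptor"}},
]
combined = combine(
    [normalized(feature_kernel), normalized(context_kernel)],
    [0.5, 0.5],  # assumed equal weights
)
K = gram_matrix(combined, items)
```

A Gram matrix like `K` is the form expected by SVM implementations that accept precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.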
Competing Interests: The authors have declared that no competing interests exist.
Conceived and designed the experiments: LL ZJ. Performed the experiments: PZ. Analyzed the data: PZ TZ HZ. Contributed reagents/materials/analysis tools: LL TZ. Wrote the paper: PZ ZJ DH. Manuscript revision: LL PZ.
ISSN:1932-6203
DOI:10.1371/journal.pone.0091898