The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text

Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology f...

Full description

Saved in:
Bibliographic Details
Published inJournal of biomedical informatics Vol. 36; no. 6; pp. 462 - 477
Main Authors Rindflesch, Thomas C, Fiszman, Marcelo
Format Journal Article
LanguageEnglish
Published United States Elsevier Inc 01.12.2003
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology for interpreting linguistic structures that encode hypernymic propositions, in which a more specific concept is in a taxonomic relationship with a more general concept. In order to effectively process these constructions, we exploit underspecified syntactic analysis and structured domain knowledge from the Unified Medical Language System (UMLS). After introducing the syntactic processing on which our system depends, we focus on the UMLS knowledge that supports interpretation of hypernymic propositions. We first use semantic groups from the Semantic Network to ensure that the two concepts involved are compatible; hierarchical information in the Metathesaurus then determines which concept is more general and which more specific. A preliminary evaluation of a sample based on the semantic group Chemicals and Drugs provides 83% precision. An error analysis was conducted and potential solutions to the problems encountered are presented. The research discussed here serves as a paradigm for investigating the interaction between domain knowledge and linguistic structure in natural language processing, and could also make a contribution to research on automatic processing of discourse structure. Additional implications of the system we present include its integration in advanced semantic interpretation processors for biomedical text and its use for information extraction in specific domains. The approach has the potential to support a range of applications, including information retrieval and ontology engineering.
Bibliography:ObjectType-Article-2
SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 23
ISSN:1532-0464
1532-0480
DOI:10.1016/j.jbi.2003.11.003