The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text

Bibliographic Details
Published in	Journal of Biomedical Informatics, Vol. 36, no. 6, pp. 462-477
Main Authors	Rindflesch, Thomas C; Fiszman, Marcelo
Format	Journal Article
Language	English
Published	United States: Elsevier Inc, 01.12.2003
Subjects	Abstracting and Indexing as Topic - methods; Algorithms; Artificial Intelligence; Database Management Systems; Databases, Factual; Information extraction; Information Storage and Retrieval - methods; Knowledge representation; Linguistics; National Library of Medicine (U.S.); Natural Language Processing; Semantic processing; Semantics; Subject Headings; Terminology as Topic; Unified Medical Language System; United States

More Information
Summary:	Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology for interpreting linguistic structures that encode hypernymic propositions, in which a more specific concept is in a taxonomic relationship with a more general concept. In order to effectively process these constructions, we exploit underspecified syntactic analysis and structured domain knowledge from the Unified Medical Language System (UMLS). After introducing the syntactic processing on which our system depends, we focus on the UMLS knowledge that supports interpretation of hypernymic propositions. We first use semantic groups from the Semantic Network to ensure that the two concepts involved are compatible; hierarchical information in the Metathesaurus then determines which concept is more general and which more specific. A preliminary evaluation of a sample based on the semantic group Chemicals and Drugs provides 83% precision. An error analysis was conducted and potential solutions to the problems encountered are presented. The research discussed here serves as a paradigm for investigating the interaction between domain knowledge and linguistic structure in natural language processing, and could also make a contribution to research on automatic processing of discourse structure. Additional implications of the system we present include its integration in advanced semantic interpretation processors for biomedical text and its use for information extraction in specific domains. The approach has the potential to support a range of applications, including information retrieval and ontology engineering.
ISSN:	1532-0464 1532-0480
DOI:	10.1016/j.jbi.2003.11.003
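
Illustration:	The two-step check described in the Summary (semantic-group compatibility first, then hierarchy lookup to decide which concept is more general) might be sketched roughly as follows. This is not the authors' implementation: the concept names, the `SEMANTIC_GROUP` and `PARENTS` tables, and the function names are hypothetical toy stand-ins for the UMLS Semantic Network groups and Metathesaurus hierarchy, which are not reproduced here.

```python
from typing import Optional, Set, Tuple

# Hypothetical toy data standing in for UMLS semantic groups.
SEMANTIC_GROUP = {
    "aspirin": "Chemicals and Drugs",
    "analgesic": "Chemicals and Drugs",
    "headache": "Disorders",
}

# Hypothetical toy parent links standing in for Metathesaurus hierarchy.
PARENTS = {
    "aspirin": {"analgesic"},
    "analgesic": {"drug"},
}


def ancestors(concept: str) -> Set[str]:
    """Collect every concept reachable by following parent links upward."""
    seen: Set[str] = set()
    stack = list(PARENTS.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS.get(node, ()))
    return seen


def interpret_hypernymy(a: str, b: str) -> Optional[Tuple[str, str, str]]:
    """Return (specific, "ISA", general), or None if no proposition holds."""
    group_a = SEMANTIC_GROUP.get(a)
    group_b = SEMANTIC_GROUP.get(b)
    # Step 1: both concepts must belong to the same known semantic group.
    if group_a is None or group_a != group_b:
        return None
    # Step 2: the hierarchy decides which concept is the more general one.
    if b in ancestors(a):
        return (a, "ISA", b)
    if a in ancestors(b):
        return (b, "ISA", a)
    return None
```

In this sketch the order of the two arguments does not matter: the hierarchy lookup alone decides the direction of the ISA relation, and a pair from different semantic groups is rejected before the hierarchy is consulted.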