Learning to extract domain-specific relations from complex sentences
Published in | Expert Systems with Applications, Vol. 60, pp. 107-117 |
---|---|
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 30.10.2016 |
Summary: |
• We propose SemIE, Semantic-based Information Extraction and Mapping.
• Our approach identifies significant relations and maps them to a semantic structure.
• Our approach bootstraps training examples from a pair of structured documents.
• The results show that our approach outperforms a current state-of-the-art system.
• The results demonstrate the effectiveness of our approach in handling complex sentences.
Open Information Extraction (OIE) systems focus on identifying and extracting general relations from text. Most OIE systems rely on simple linguistic structure, such as part-of-speech or dependency features, to extract relations and arguments from a sentence. These approaches are simple and fast to implement, but suffer from two main drawbacks: i) they are less effective at handling complex sentences with multiple relations and shared arguments, and ii) they tend to extract overly specific relations.
This paper proposes an approach to Information Extraction called SemIE, which addresses both drawbacks. SemIE identifies significant relations in domain-specific text by using a semantic structure that describes the domain of discourse. SemIE exploits the predicate-argument structure of a text, which allows it to handle complex sentences. The semantics of the arguments are specified explicitly by mapping them to relevant concepts in the semantic structure.
SemIE uses a semi-supervised learning approach to bootstrap training examples that cover all relations expressed in the semantic structure. SemIE takes pairs of structured documents as input and uses a Greedy Mapping module to bootstrap a full set of training examples. The training examples are then used to learn the extraction and mapping rules.
We evaluated the performance of SemIE by comparing it with OLLIE, a state-of-the-art OIE system. We tested SemIE and OLLIE on the task of extracting relations from text in the “movie” domain and found that, on average, SemIE outperforms OLLIE. We also examined how performance varies with sentence complexity and sentence length. The results demonstrate the effectiveness of SemIE in handling complex sentences. |
---|---|
ISSN: | 0957-4174, 1873-6793 |
DOI: | 10.1016/j.eswa.2016.05.004 |
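
The summary says that SemIE bootstraps training examples by greedily mapping between paired documents, but this record does not describe the Greedy Mapping module itself. The sketch below is therefore not the authors' algorithm. It only illustrates the general idea of a greedy alignment, assuming the structured side is a list of (subject, relation, object) facts and the text side is a list of sentences. The names `Fact`, `overlap_score`, and `greedy_map`, the token-overlap score, and the threshold are all hypothetical choices made for illustration, and the input is toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One (subject, relation, object) triple; a hypothetical shape, not from the paper."""
    subject: str
    relation: str
    obj: str


def tokens(text):
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def overlap_score(fact, sentence):
    """Fraction of the fact's argument tokens that also occur in the sentence."""
    arg_tokens = tokens(fact.subject) | tokens(fact.obj)
    if not arg_tokens:
        return 0.0
    return len(arg_tokens & tokens(sentence)) / len(arg_tokens)


def greedy_map(facts, sentences, threshold=1.0):
    """Pair each fact with its best-scoring sentence, taking the best pairs first.

    One sentence may be paired with several facts, because a complex
    sentence can express multiple relations with shared arguments.
    Facts whose best score is below the threshold stay unmapped.
    """
    scored = [
        (overlap_score(fact, sent), fi, si)
        for fi, fact in enumerate(facts)
        for si, sent in enumerate(sentences)
    ]
    # Highest score first; ties fall back to input order so the result is deterministic.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    assigned = {}
    for score, fi, si in scored:
        if score < threshold:
            break
        if fi not in assigned:
            assigned[fi] = si
    return [(sentences[si], facts[fi]) for fi, si in sorted(assigned.items())]


# Toy input in the "movie" domain (illustrative only).
facts = [
    Fact("Inception", "directedBy", "Christopher Nolan"),
    Fact("Inception", "starring", "Leonardo DiCaprio"),
    Fact("Inception", "musicBy", "Hans Zimmer"),
]
sentences = [
    "Inception was directed by Christopher Nolan and stars Leonardo DiCaprio.",
    "The film was released in 2010.",
]
pairs = greedy_map(facts, sentences)
for sentence, fact in pairs:
    print(fact.relation, "<-", sentence)
```

In this toy run the first sentence is paired with both the `directedBy` and `starring` facts, which mirrors the multiple-relations, shared-argument case the summary highlights, while the `musicBy` fact stays unmapped because no sentence mentions its object. A real system would need a stronger matching signal than bag-of-words overlap.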