Semantic Matching Against a Corpus: New Applications and Methods

Bibliographic Details
Published in: arXiv.org
Main Authors: Lin, Lucy H; Miles, Scott; Smith, Noah A
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 28.08.2018
ISSN: 2331-8422

Summary: We consider the case of a domain expert who wishes to explore the extent to which a particular idea is expressed in a text collection. We propose the task of semantically matching the idea, expressed as a natural language proposition, against a corpus. We create two preliminary tasks derived from existing datasets, and then introduce a more realistic one on disaster recovery designed for emergency managers, whom we engaged in a user study. On the latter, we find that a new model built from natural language entailment data produces higher-quality matches than simple word-vector averaging, both on expert-crafted queries and on ones produced by the subjects themselves. This work provides a proof of concept for such applications of semantic matching and illustrates key challenges.
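
Illustration: the summary names simple word-vector averaging as the baseline for matching a proposition against a corpus. The sketch below shows one common form of that idea, assuming cosine similarity between averaged vectors as the score; the record does not give the paper's exact scoring or embeddings. The vectors, the example sentences, and the names average_vector, cosine and rank_matches are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy word vectors (illustration only; not taken from the paper).
TOY_VECTORS = {
    "flood":   [0.9, 0.1, 0.0],
    "water":   [0.8, 0.2, 0.1],
    "damaged": [0.7, 0.3, 0.0],
    "homes":   [0.6, 0.4, 0.1],
    "houses":  [0.6, 0.5, 0.1],
    "council": [0.0, 0.2, 0.9],
    "budget":  [0.1, 0.1, 0.8],
    "meeting": [0.0, 0.3, 0.9],
}


def average_vector(text, vectors):
    """Average the vectors of the known words in text; None if none are known."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_matches(proposition, corpus, vectors):
    """Return (score, sentence) pairs sorted from best to worst match."""
    query = average_vector(proposition, vectors)
    scored = []
    for sentence in corpus:
        vec = average_vector(sentence, vectors)
        # Sentences (or queries) with no known words get a score of 0.0.
        score = cosine(query, vec) if query is not None and vec is not None else 0.0
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


corpus = ["water damaged houses", "council budget meeting"]
ranked = rank_matches("flood damaged homes", corpus, TOY_VECTORS)
for score, sentence in ranked:
    print(f"{score:.3f}  {sentence}")
```

Averaging discards word order and sentence structure, which is one plausible reason a model trained on entailment data could produce better matches, as the summary reports.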