A support vector machine-based context-ranking model for question answering

Bibliographic Details
Published in: Information Sciences, Vol. 224, pp. 77–87
Main Authors: Yen, Show-Jane; Wu, Yu-Chieh; Yang, Jie-Chi; Lee, Yue-Shi; Lee, Chung-Jung; Liu, Jui-Jung
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.03.2013
Summary: Modern information technologies and Internet services face the problem of selecting and managing a growing amount of textual information to which access is often critical. Machine learning techniques have recently shown excellent performance and flexibility in many applications, such as artificial intelligence and pattern recognition. Question answering (QA) is the task of locating exact answer sentences in vast document collections. This paper presents a machine learning-based question-answering framework that integrates a question classifier, simple document/passage retrievers, and the proposed context-ranking models. The question classifier is trained to categorize the answer type of a given question and instructs the context-ranking model to re-rank the passages returned by the initial retrievers. The method provides the learners with flexible features, such as word forms, syntactic features, and semantic word features. The proposed context-ranking model, which approaches context ranking as a sequential labeling task, combines these rich features to predict whether an input passage is relevant to the question type. TREC-QA tracks and question classification benchmarks are used to evaluate the proposed method. The experimental results show that the question classifier achieves 85.60% accuracy without any additional semantic or syntactic taggers, rising to 88.60% with the proposed term-expansion techniques and a predefined related-word set. In the TREC-10 QA task, the QA model achieves a 0.563 mean reciprocal rank (MRR) using the gold TREC-provided relevant document set, and 0.342 MRR using the simple document and passage retrieval algorithms.
ISSN: 0020-0255; 1872-6291
DOI: 10.1016/j.ins.2012.10.014
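
The summary above mentions an SVM-driven ranking component and MRR scoring without implementation detail. The following is a minimal Python sketch of the general idea, not the authors' model: a linear SVM over TF-IDF features of question–passage pairs (a simplified stand-in for the paper's word-form, syntactic, and semantic features) re-ranks retrieved candidates by decision score, and a small helper computes mean reciprocal rank. All data, names, and the feature choice here are illustrative assumptions.

```python
"""Minimal sketch: SVM-based passage re-ranking and MRR scoring (illustrative)."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC


def mean_reciprocal_rank(ranked_relevance):
    """MRR: for each question, take 1/rank of the first relevant passage
    in its ranked candidate list (0 if none is relevant), then average."""
    total = 0.0
    for flags in ranked_relevance:
        total += next((1.0 / r for r, rel in enumerate(flags, 1) if rel), 0.0)
    return total / len(ranked_relevance)


# Toy training pairs: a question concatenated with a candidate passage,
# labeled 1 if the passage answers the question, else 0. TF-IDF over the
# concatenation is a stand-in for the paper's richer feature set.
pairs = [
    ("who wrote hamlet shakespeare wrote the play hamlet", 1),
    ("who wrote hamlet the play premiered centuries ago", 0),
    ("when was the eiffel tower built it was completed in 1889", 1),
    ("when was the eiffel tower built paris is the capital of france", 0),
]
texts, labels = zip(*pairs)
vectorizer = TfidfVectorizer()
clf = LinearSVC().fit(vectorizer.fit_transform(texts), labels)

# Re-rank retrieved candidates for a new question by SVM decision score.
question = "who wrote hamlet"
candidates = ["shakespeare wrote the play hamlet", "paris is in france"]
scores = clf.decision_function(
    vectorizer.transform([f"{question} {c}" for c in candidates]))
reranked = sorted(zip(candidates, scores), key=lambda x: -x[1])
print([c for c, _ in reranked])

# Score the ranking: the relevant passage at rank 1 gives MRR = 1.0.
print(mean_reciprocal_rank([[1, 0]]))
```

As a reading aid for the reported numbers: a question answered at rank 1 contributes 1.0 to MRR and one answered at rank 3 contributes 1/3, so the reported 0.563 corresponds roughly to correct passages appearing near rank 2 on average.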