Passage Re-ranking Method Based on Sentence Similarity through Multi-task Learning

Bibliographic Details
Published in	Chŏngbo Kwahakhoe nonmunji pp. 416 - 421
Main Authors	장영진, 이현구, 왕지현, 이충희, 김학수
Format	Journal Article
Language	Korean
Published	한국정보과학회 (Korean Institute of Information Scientists and Engineers) 01.04.2020
Subjects	Computer science
ISSN	2383-630X 2383-6296
DOI	10.5626/JOK.2020.47.4.416

Summary:	A machine reading comprehension (MRC) system is a question answering system in which a computer reads a given passage and answers questions about it. With recent advances in deep neural networks, MRC research has become active, and work is now under way on open-domain MRC systems, which find the answer in the results returned by an information retrieval (IR) model rather than in a given document. However, if the IR model fails to retrieve a passage containing the correct answer, an open-domain MRC system cannot answer the question. In other words, the performance of an open-domain MRC system depends on the performance of the IR model, so a high-performing open-domain MRC system requires a high-performing IR model. Previous work on improving IR performance has relied on techniques such as query expansion and re-ranking. This paper proposes a re-ranking method that uses deep neural networks. The proposed model re-ranks the retrieval results (passages) by measuring sentence similarity with multi-task learning. In experiments on a self-constructed MRC dataset of 58,980 pairs, it improved on the existing IR model by about 8 percentage points (measured by Precision 1).
KCI Citation Count:	0