Sequence tagging for biomedical extractive question answering
Published in | Bioinformatics Vol. 38; no. 15; pp. 3794 - 3801 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | England: Oxford University Press, 02.08.2022 |
Subjects | |
Summary: | Abstract
Motivation
Current studies in extractive question answering (EQA) model the single-span extraction setting, in which a single answer span is the label to predict for a given question-passage pair. This setting is natural for general-domain EQA, as the majority of general-domain questions can be answered with a single span. Following general-domain EQA models, current biomedical EQA (BioEQA) models use the single-span extraction setting with post-processing steps.
Results
In this article, we investigate the question distribution across the general and biomedical domains and discover that biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (a single answer). This calls for models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly tackles questions with a variable number of phrases as their answer and can learn from training data to decide the number of answers for a question. In our experiments on the BioASQ 7b and 8b list-type questions, our approach outperformed the best-performing existing models without requiring post-processing steps.
Availability and implementation
Source code and resources are freely available for download at https://github.com/dmis-lab/SeqTagQA.
Supplementary information
Supplementary data are available at Bioinformatics online. |
---|---|
Bibliography: | This work was done while Wonjin Yoon worked under the Research Collaboration project at AstraZeneca. |
ISSN: | 1367-4803 1460-2059 1367-4811 |
DOI: | 10.1093/bioinformatics/btac397 |
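
The multi-span setting described in the abstract can be illustrated with a small decoding sketch. It assumes a BIO tagging scheme (B = first token of an answer, I = continuation, O = outside any answer); the abstract does not state which scheme the authors use, and the function name `decode_bio_spans`, the example passage, and the tags are all hypothetical. The point is only that per-token tags yield a variable number of answers with no separate step for choosing how many to return.

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every tagged answer span from per-token BIO labels.

    Assumption: tags are "B", "I", or "O". This is an illustrative
    scheme, not necessarily the one used in the paper.
    """
    spans: List[str] = []
    current: List[str] = []  # tokens of the span currently being built
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new answer starts; close any span that was still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the open span.
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close and reset.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


if __name__ == "__main__":
    # Made-up passage and tags for a list-type question.
    tokens = ["Symptoms", "include", "fever", ",", "dry", "cough", "and", "fatigue", "."]
    tags = ["O", "O", "B", "O", "B", "I", "O", "B", "O"]
    print(decode_bio_spans(tokens, tags))  # ['fever', 'dry cough', 'fatigue']
```

A passage tagged entirely with "O" yields an empty list, and a passage with one tagged phrase yields a single answer, so factoid-type and list-type questions go through the same decoding path.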