MuSiQue: Multihop Questions via Single-hop Question Composition
Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by construction, proper multihop reasoning? To this end, we introduce a bottom–up approach that systematically selects comp...
Saved in:
Published in | Transactions of the Association for Computational Linguistics Vol. 10; pp. 539 - 554 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published |
One Broadway, 12th Floor, Cambridge, Massachusetts 02142, USA
MIT Press
04.05.2022
MIT Press Journals, The The MIT Press |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by construction,
proper multihop reasoning? To this end, we introduce a bottom–up approach that systematically selects composable pairs of single-hop questions that are connected, that is, where one reasoning step critically relies on information from another. This bottom–up methodology lets us explore a vast space of questions and add stringent filters as well as other mechanisms targeting connected reasoning. It provides fine-grained control over the construction process and the properties of the resulting
-hop questions. We use this methodology to create MuSiQue-Ans, a new multihop QA dataset with 25K 2–4 hop questions. Relative to existing datasets, MuSiQue-Ans is more difficult overall (3× increase in human–machine gap), and harder to cheat via disconnected reasoning (e.g., a single-hop model has a 30-point drop in F1). We further add unanswerable contrast questions to produce a more stringent dataset, MuSiQue-Full. We hope our datasets will help the NLP community develop models that perform genuine multihop reasoning. |
---|---|
Bibliography: | 2022 ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 2307-387X 2307-387X |
DOI: | 10.1162/tacl_a_00475 |