Paper Title

Understanding and Improving Zero-shot Multi-hop Reasoning in Generative Question Answering

Authors

Jiang, Zhengbao, Araki, Jun, Ding, Haibo, Neubig, Graham

Abstract

Generative question answering (QA) models generate answers to questions either solely based on the parameters of the model (the closed-book setting) or additionally retrieving relevant evidence (the open-book setting). Generative QA models can answer some relatively complex questions, but the mechanism through which they do so is still poorly understood. We perform several studies aimed at better understanding the multi-hop reasoning capabilities of generative QA models. First, we decompose multi-hop questions into multiple corresponding single-hop questions, and find marked inconsistency in QA models' answers on these pairs of ostensibly identical question chains. Second, we find that models lack zero-shot multi-hop reasoning ability: when trained only on single-hop questions, models generalize poorly to multi-hop questions. Finally, we demonstrate that it is possible to improve models' zero-shot multi-hop reasoning capacity through two methods that approximate real multi-hop natural language (NL) questions by training on either concatenation of single-hop questions or logical forms (SPARQL). In sum, these results demonstrate that multi-hop reasoning does not emerge naturally in generative QA models, but can be encouraged by advances in training or modeling techniques.
