Paper Title
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
Paper Authors
Paper Abstract
Knowledge underpins reasoning. Recent research demonstrates that when relevant knowledge is provided as additional context for commonsense question answering (QA), it can substantially improve performance even on top of state-of-the-art models. The fundamental challenge is where and how to find knowledge that is high quality and on point with respect to the question: knowledge retrieved from knowledge bases is incomplete, and knowledge generated by language models is inconsistent. We present Rainier, or Reinforced Knowledge Introspector, which learns to generate contextually relevant knowledge in response to given questions. Our approach starts by imitating knowledge generated by GPT-3, then learns to generate its own knowledge via reinforcement learning, where rewards are shaped based on the increased performance of the resulting question answering. When tested on 9 different commonsense benchmarks, including 5 datasets seen during model training and 4 datasets kept unseen, Rainier demonstrates substantial and consistent performance gains. Our work is the first to report that knowledge generated by models orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of commonsense knowledge elicited from GPT-3.
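To make the reward-shaping idea concrete, below is a minimal Python sketch (not the authors' implementation) of one way a per-question reward could be computed: a generated knowledge statement is rewarded by how much it raises a fixed QA model's confidence in the correct answer. The helper qa_score is a hypothetical stand-in for that QA model.

from typing import Callable, Optional

def knowledge_reward(
    question: str,
    correct_answer: str,
    knowledge: str,
    qa_score: Callable[[str, str, Optional[str]], float],
) -> float:
    """Illustrative reward: the QA model's confidence in the correct answer
    when the generated knowledge is prepended to the question, minus its
    confidence without any knowledge. (Hypothetical simplification of the
    reward shaping described in the abstract.)"""
    score_with = qa_score(question, correct_answer, knowledge)
    score_without = qa_score(question, correct_answer, None)
    return score_with - score_without

Under this sketch, knowledge that helps the QA model yields a positive reward and misleading knowledge yields a negative one, so the reinforcement-learning stage can favor knowledge that actually improves downstream answering without any direct supervision on the knowledge text itself.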