Paper Title

Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

Paper Authors

Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu

Abstract

In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer. State-of-the-art models tend to use a pretrained transformer model (e.g., BERT) to encode the joint contextual information of document and question. However, these transformer-based models can only take a fixed-length (e.g., 512) text as their input. To deal with even longer text inputs, previous approaches usually chunk them into equally-spaced segments and predict answers based on each segment independently without considering the information from other segments. As a result, they may form segments that fail to cover the correct answer span or retain insufficient contexts around it, which significantly degrades the performance. Moreover, they are less capable of answering questions that need cross-segment information. We propose to let a model learn to chunk in a more flexible way via reinforcement learning: a model can decide the next segment that it wants to process in either direction. We also employ recurrent mechanisms to enable information to flow across segments. Experiments on three MRC datasets -- CoQA, QuAC, and TriviaQA -- demonstrate the effectiveness of our proposed recurrent chunking mechanisms: we can obtain segments that are more likely to contain complete answers and at the same time provide sufficient contexts around the ground truth answers for better predictions.
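
The central idea in the abstract is to replace fixed-stride chunking with a policy that chooses where the next segment starts, while a recurrent state carries information across segments. The Python sketch below is only a rough illustration of that control flow under assumptions of our own: the `encode` placeholder, the candidate stride set, and the greedy segment scoring are invented for this example and do not reproduce the authors' implementation, which trains the stride policy with reinforcement learning on top of a BERT encoder.

```python
# Illustrative sketch (not the paper's code): moving a fixed-size window over a
# long document, with the next stride chosen by a policy rather than fixed.
import random

MAX_LEN = 512                     # maximum segment length a BERT-like encoder accepts
STRIDES = [-256, 128, 256, 512]   # assumed candidate moves, backward or forward


def encode(segment_tokens, question_tokens, state):
    """Placeholder for a transformer encoder with a recurrent state.

    Returns (answer_span_score, new_state). A real implementation would run
    BERT over [CLS] question [SEP] segment and update the recurrent state.
    """
    score = random.random()   # stand-in for the span-extraction confidence
    new_state = state         # stand-in for the recurrent state update
    return score, new_state


def read_long_document(doc_tokens, question_tokens, max_steps=8):
    """Slide a window over doc_tokens, letting a (here random) policy pick the
    next stride; keep the position of the best-scoring segment."""
    pos, state = 0, None
    best_score, best_pos = float("-inf"), 0
    for _ in range(max_steps):
        segment = doc_tokens[pos: pos + MAX_LEN]
        score, state = encode(segment, question_tokens, state)
        if score > best_score:
            best_score, best_pos = score, pos
        # In the paper the stride is chosen by a policy trained with
        # reinforcement learning from the recurrent state; here it is random.
        stride = random.choice(STRIDES)
        pos = min(max(pos + stride, 0), max(len(doc_tokens) - MAX_LEN, 0))
    return best_pos, best_score


# Minimal usage with stand-in token ids for a 5,000-token document.
doc = list(range(5000))
question = list(range(20))
print(read_long_document(doc, question))
```

The contrast with baseline approaches is in the stride: equally-spaced chunking always advances by a constant amount, whereas here the window can also move backward or take a larger jump, which is what lets the model recover segments that fully contain the answer with enough surrounding context.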
