Paper Title

UnifiedQA: Crossing Format Boundaries With a Single QA System

Authors

Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi

Abstract

Question answering (QA) tasks have been posed using a variety of formats, such as extractive span selection, multiple choice, etc. This has led to format-specialized models, and even to an implicit division in the QA community. We argue that such boundaries are artificial and perhaps unnecessary, given the reasoning abilities we seek to teach are not governed by the format. As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats. UnifiedQA performs on par with 9 different models that were trained on individual datasets themselves. Even when faced with 12 unseen datasets of observed formats, UnifiedQA performs surprisingly well, showing strong generalization from its out-of-format training data. Finally, simply fine-tuning this pre-trained QA model into specialized models results in a new state of the art on 6 datasets, establishing UnifiedQA as a strong starting point for building QA systems.
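The key idea behind crossing format boundaries is to render every QA format as plain text so that a single text-to-text model can consume them all. The sketch below illustrates one way such an encoder could look; the function name, the lowercasing, and the exact field separator are assumptions for illustration, not the paper's verbatim specification.

```python
def encode_unified_input(question, context=None, choices=None):
    """Render a QA example of any format as one plain-text input.

    Hypothetical sketch of format unification in the spirit of
    UnifiedQA: extractive, multiple-choice, yes/no, and abstractive
    examples all become a single string, so one seq2seq model can
    be trained across formats. Separator and casing are assumptions.
    """
    parts = [question.lower()]
    if choices:
        # Multiple-choice options are inlined as "(a) ... (b) ..."
        letters = "abcdefghij"
        parts.append(" ".join(
            f"({letters[i]}) {c.lower()}" for i, c in enumerate(choices)))
    if context:
        parts.append(context.lower())
    return "\n".join(parts)


# An extractive example carries a context paragraph; a multiple-choice
# example carries answer options. Both map to the same input space.
extractive = encode_unified_input(
    "Who wrote Hamlet?",
    context="Hamlet is a play by Shakespeare.")
multiple_choice = encode_unified_input(
    "Which is a fruit?", choices=["Apple", "Chair"])
```

Because both outputs live in the same text space, no format-specific model head is needed; the target answer is likewise generated as free text.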
