Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering

Paper title

Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering

Authors

Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao

Abstract

Given their effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular. Specifically, the de facto architecture for open-domain question answering uses two isomorphic encoders that are initialized from the same pretrained model but separately parameterized for questions and passages. This bi-encoder architecture is parameter-inefficient because the two encoders share no parameters. Further, recent studies show that such dense retrievers underperform BM25 in various settings. We thus propose a new architecture, Task-aware Specialization for dense Retrieval (TASER), which enables parameter sharing by interleaving shared and specialized blocks in a single encoder. Our experiments on five question answering datasets show that TASER can achieve superior accuracy, surpassing BM25, while using about 60% of the parameters of bi-encoder dense retrievers. In out-of-domain evaluations, TASER is also empirically more robust than bi-encoder dense retrievers. Our code is available at https://github.com/microsoft/taser.
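The parameter-sharing idea in the abstract can be illustrated with a small counting sketch. It is not the authors' implementation: the block count, the rule that every second block is specialized, and all function names here are hypothetical choices made for illustration. The sketch only names which parameter set each input type would use at each block, then counts the distinct sets.

```python
# Toy illustration of interleaving shared and specialized blocks.
# Assumptions (not from the paper): 12 blocks, every 2nd block specialized,
# and every block counted as one equally sized "parameter set".

INPUT_TYPES = ("question", "passage")


def is_specialized(block_index: int, specialize_every: int = 2) -> bool:
    """Hypothetical layout rule: blocks 1, 3, 5, ... are specialized."""
    return block_index % specialize_every == specialize_every - 1


def parameter_set(block_index: int, input_type: str, specialize_every: int = 2) -> str:
    """Name the parameter set an input of the given type uses at one block."""
    if is_specialized(block_index, specialize_every):
        # Specialized block: questions and passages get separate parameters.
        return f"block{block_index}/{input_type}"
    # Shared block: both input types reuse the same parameters.
    return f"block{block_index}/shared"


def interleaved_param_sets(num_blocks: int, specialize_every: int = 2) -> set:
    """Distinct parameter sets in a single encoder with interleaved blocks."""
    return {
        parameter_set(i, t, specialize_every)
        for i in range(num_blocks)
        for t in INPUT_TYPES
    }


def bi_encoder_param_sets(num_blocks: int) -> set:
    """Distinct parameter sets in a bi-encoder: nothing is shared."""
    return {f"{t}_encoder/block{i}" for i in range(num_blocks) for t in INPUT_TYPES}


if __name__ == "__main__":
    single = interleaved_param_sets(12)
    dual = bi_encoder_param_sets(12)
    print(len(single), len(dual), len(single) / len(dual))
```

With these toy settings the single encoder holds 18 parameter sets against 24 for the bi-encoder, a ratio of 0.75. That figure is a property of the assumed layout only and is not meant to reproduce the roughly 60% reported in the abstract, which depends on architectural details the abstract does not give.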
