Paper Title

GAPX: Generalized Autoregressive Paraphrase-Identification X

Authors

Yifei Zhou, Renyu Li, Hayden Housen, Ser-Nam Lim

Abstract

Paraphrase identification is a fundamental task in natural language processing. While much progress has been made in the field, the performance of many state-of-the-art models often suffers from distribution shift at inference time. We verify that a major source of this performance drop is bias introduced by negative examples. To overcome this bias, we propose training two separate models, one that uses only the positive pairs and another that uses only the negative pairs. This gives us the option of deciding how much to rely on the negative model, for which we introduce a perplexity-based out-of-distribution metric that we show can effectively and automatically determine how much weight the negative model should be given during inference. We support our findings with strong empirical results.
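
The weighting idea in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the function names, the sigmoid mapping from perplexity to weight, the `threshold` and `temperature` values, and the subtraction used to combine the two scores are hypothetical and are not taken from the paper. Only the definition of perplexity (the exponential of the mean negative log-likelihood per token) is standard.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def negative_weight(ppl, threshold=20.0, temperature=5.0):
    """Hypothetical mapping from perplexity to a weight in (0, 1).

    Low perplexity (input looks in-distribution) gives a weight near 1;
    high perplexity (input looks out-of-distribution) gives a weight near 0.
    The sigmoid form and the default constants are assumptions.
    """
    # Clamp the exponent so math.exp cannot overflow for very large perplexities.
    z = min((ppl - threshold) / temperature, 50.0)
    return 1.0 / (1.0 + math.exp(z))


def combined_score(pos_score, neg_score, ppl):
    """Hypothetical combination: trust the negative model less on OOD input."""
    return pos_score - negative_weight(ppl) * neg_score


# Usage: the same pair of model scores, seen in- and out-of-distribution.
in_dist = combined_score(pos_score=0.9, neg_score=0.4, ppl=8.0)
out_dist = combined_score(pos_score=0.9, neg_score=0.4, ppl=200.0)
```

With these assumed settings, the negative model's score is subtracted almost in full for the low-perplexity input and is almost ignored for the high-perplexity one, which is the qualitative behavior the abstract describes.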
