Paper Title
Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks
Paper Authors
Paper Abstract
Memorization presents a challenge for several constrained Natural Language Generation (NLG) tasks such as Neural Machine Translation (NMT), wherein the proclivity of neural models to memorize noisy and atypical samples interacts adversely with noisy (web-crawled) datasets. However, previous studies of memorization in constrained NLG tasks have focused only on counterfactual memorization, linking it to the problem of hallucinations. In this work, we propose a new, inexpensive algorithm for detecting extractive memorization (exact training data generation under insufficient context) in constrained sequence generation tasks and use it to study extractive memorization and its effects in NMT. We demonstrate that extractive memorization poses a serious threat to NMT reliability by qualitatively and quantitatively characterizing the memorized samples as well as the model behavior in their vicinity. Based on empirical observations, we develop a simple algorithm which elicits non-memorized translations of memorized samples from the same model, for a large fraction of such samples. Finally, we show that the proposed algorithm can also be leveraged to mitigate memorization in the model through fine-tuning. We have released the code to reproduce our results at https://github.com/vyraun/Finding-Memo.
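The abstract defines extractive memorization as exact training-data generation under insufficient context: the model reproduces a training target verbatim even when given only a truncated prefix of the source. A minimal sketch of that detection idea is below; it is illustrative only, not the paper's actual algorithm. The `translate` callable is a hypothetical stand-in for any NMT decode function, and tokenization by whitespace is an assumption for simplicity.

```python
def find_memorized(pairs, translate, min_keep=1):
    """Flag training pairs whose exact target is generated from a strict
    prefix of the source sentence (illustrative sketch only).

    pairs:     iterable of (source, target) training strings
    translate: hypothetical decode function, source string -> translation
    min_keep:  smallest prefix length (in tokens) to try
    """
    memorized = []
    for src, tgt in pairs:
        tokens = src.split()  # whitespace tokenization assumed
        # Try strictly shorter prefixes of the source ("insufficient context").
        for k in range(min_keep, len(tokens)):
            prefix = " ".join(tokens[:k])
            if translate(prefix) == tgt:
                # Model emits the full training target despite the truncation.
                memorized.append((src, tgt, prefix))
                break
    return memorized
```

Scanning a full training set this way costs one or a few decodes per sample, which is consistent with the abstract's claim that the procedure is inexpensive relative to counterfactual-memorization estimates (which require retraining).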