Unsupervised Paraphrasing via Deep Reinforcement Learning

Paper title

Unsupervised Paraphrasing via Deep Reinforcement Learning

Authors

Siddique, A. B., Oymak, Samet, Hristidis, Vagelis

Abstract

Paraphrasing is expressing the meaning of an input sentence in different wording while maintaining fluency (i.e., grammatical and syntactical correctness). Most existing work on paraphrasing uses supervised models that are limited to specific domains (e.g., image captions). Such models can neither be straightforwardly transferred to other domains nor generalize well, and creating labeled training data for new domains is expensive and laborious. The need for paraphrasing across different domains and the scarcity of labeled training data in many such domains call for exploring unsupervised paraphrase generation methods. We propose Progressive Unsupervised Paraphrasing (PUP): a novel unsupervised paraphrase generation method based on deep reinforcement learning (DRL). PUP uses a variational autoencoder (trained using a non-parallel corpus) to generate a seed paraphrase that warm-starts the DRL model. Then, PUP progressively tunes the seed paraphrase guided by our novel reward function, which combines semantic adequacy, language fluency, and expression diversity measures to quantify the quality of the generated paraphrases in each iteration without needing parallel sentences. Our extensive experimental evaluation shows that PUP outperforms unsupervised state-of-the-art paraphrasing techniques in terms of both automatic metrics and user studies on four real datasets. We also show that PUP outperforms domain-adapted supervised algorithms on several datasets. Our evaluation also shows that PUP achieves a great trade-off between semantic similarity and diversity of expression.
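
To make the idea of a reward that combines semantic adequacy, language fluency, and expression diversity more concrete, here is a minimal Python sketch. It is not the reward function from the paper: the abstract does not give its form, so the weighted sum, the equal default weights, the token-overlap adequacy score, the bigram-based diversity score, and all function names (`jaccard`, `diversity`, `combined_reward`, `fluency_fn`) are assumptions chosen only for illustration. In practice, adequacy would more plausibly come from sentence embeddings and fluency from a language model.

```python
from typing import Callable, Sequence, Set, Tuple


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Token-set overlap in [0, 1]; a toy stand-in for semantic adequacy."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def bigrams(tokens: Sequence[str]) -> Set[Tuple[str, str]]:
    """Set of adjacent token pairs."""
    return set(zip(tokens, tokens[1:]))


def diversity(source: Sequence[str], candidate: Sequence[str]) -> float:
    """1 minus the fraction of candidate bigrams that also occur in the source."""
    cand = bigrams(candidate)
    if not cand:
        return 0.0
    return 1.0 - len(cand & bigrams(source)) / len(cand)


def combined_reward(
    source: Sequence[str],
    candidate: Sequence[str],
    fluency_fn: Callable[[Sequence[str]], float],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of adequacy, fluency, and diversity (hypothetical form).

    `fluency_fn` is a caller-supplied scorer returning a value in [0, 1];
    no parallel (source, reference) pairs are needed to compute the reward.
    """
    w_adq, w_flu, w_div = weights
    return (
        w_adq * jaccard(source, candidate)
        + w_flu * fluency_fn(candidate)
        + w_div * diversity(source, candidate)
    )


if __name__ == "__main__":
    src = "how can i learn python".split()
    cand = "what is the way to learn python".split()
    # A constant fluency scorer keeps the example self-contained.
    print(combined_reward(src, cand, fluency_fn=lambda tokens: 1.0))
```

Even in this toy form, the tension mentioned at the end of the abstract is visible: copying the input maximizes the adequacy term but drives the diversity term to zero, so a candidate has to balance the two to score well.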
