Paper Title

Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization

Paper Authors

Kunal Chawla, Diyi Yang

Paper Abstract

Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences, which can be used to improve performance of many downstream NLP tasks. In this work, we propose a semi-supervised formality style transfer model that utilizes a language model-based discriminator to maximize the likelihood of the output sentence being formal, which allows us to use maximization of token-level conditional probabilities for training. We further propose to maximize mutual information between source and target styles as our training objective instead of maximizing the regular likelihood that often leads to repetitive and trivial generated responses. Experiments showed that our model outperformed previous state-of-the-art baselines significantly in terms of both automated metrics and human judgement. We further generalized our model to unsupervised text style transfer task, and achieved significant improvements on two benchmark sentiment style transfer datasets.
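The mutual-information objective mentioned in the abstract is closely related to a maximum-mutual-information (MMI) criterion: rather than maximizing log p(y|x) alone, the model is trained to maximize log p(y|x) − λ log p(y), where p(y) is scored by a language model. The snippet below is a minimal sketch of such a loss under that interpretation, not the authors' implementation; the function name, tensor shapes, the λ weight, and the use of PyTorch are assumptions.

```python
import torch
import torch.nn.functional as F

def mmi_style_loss(seq2seq_logits, lm_logits, target_ids, lmbda=0.5):
    """Hypothetical sketch of a mutual-information-style objective:
    maximize log p(y|x) - lmbda * log p(y), i.e. minimize its negation.

    seq2seq_logits: (batch, seq_len, vocab) from the style-transfer model p(y|x)
    lm_logits:      (batch, seq_len, vocab) from an unconditional LM scoring p(y)
    target_ids:     (batch, seq_len) token ids of the formal target sentence
    """
    vocab = seq2seq_logits.size(-1)
    # token-level conditional log-likelihood log p(y_t | y_<t, x)
    cond_logp = -F.cross_entropy(
        seq2seq_logits.reshape(-1, vocab), target_ids.reshape(-1), reduction="none"
    )
    # token-level marginal log-likelihood log p(y_t | y_<t) under the LM
    marg_logp = -F.cross_entropy(
        lm_logits.reshape(-1, vocab), target_ids.reshape(-1), reduction="none"
    )
    # negative of the (approximate) mutual-information objective, averaged over tokens
    return -(cond_logp - lmbda * marg_logp).mean()
```

Subtracting the language-model term penalizes outputs that are generic under p(y), which is one way to discourage the repetitive, trivial responses that plain likelihood training tends to produce.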
