Paper Title

Diffusion-LM Improves Controllable Text Generation

Authors

Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto

Abstract

Controlling the behavior of language models (LMs) without re-training is a major open problem in natural language generation. While recent works have demonstrated successes on controlling simple sentence attributes (e.g., sentiment), there has been little progress on complex, fine-grained controls (e.g., syntactic structure). To address this challenge, we develop a new non-autoregressive language model based on continuous diffusions that we call Diffusion-LM. Building upon the recent successes of diffusion models in continuous domains, Diffusion-LM iteratively denoises a sequence of Gaussian vectors into word vectors, yielding a sequence of intermediate latent variables. The continuous, hierarchical nature of these intermediate variables enables a simple gradient-based algorithm to perform complex, controllable generation tasks. We demonstrate successful control of Diffusion-LM for six challenging fine-grained control tasks, significantly outperforming prior work.
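The abstract describes plug-and-play control: at each reverse-diffusion step, the latent is first denoised by the model and then nudged by the gradient of a control loss. The toy sketch below illustrates that two-step loop in a 2-D continuous space. Everything here is a hypothetical stand-in, not the paper's actual algorithm: `TARGET` plays the role of the clean word vector the unconditioned model would produce, `GOAL` plays the role of a classifier-defined control target, and `denoise_step` is a hand-written interpolation rather than a trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for illustration only (not values from the paper):
# TARGET mimics the clean word vector the unconditioned model would decode to;
# GOAL mimics a control target defined by an external classifier.
TARGET = np.array([1.0, -1.0])
GOAL = np.array([2.0, 0.0])

def denoise_step(x_t, t, T):
    """Toy denoiser: interpolates the latent toward the clean sample,
    mimicking one reverse-diffusion step of a trained model."""
    alpha = 1.0 - t / T
    return alpha * TARGET + (1.0 - alpha) * x_t

def control_grad(x):
    """Analytic gradient of a quadratic control loss ||x - GOAL||^2,
    a stand-in for the gradient a real classifier would supply."""
    return 2.0 * (x - GOAL)

T = 50
x0 = rng.standard_normal(2)  # start from Gaussian noise
x = x0.copy()
for t in range(T, 0, -1):
    x = denoise_step(x, t, T)       # model-driven denoising
    x = x - 0.05 * control_grad(x)  # gradient step toward the control goal

# The final latent lands between TARGET and GOAL,
# reflecting the fluency-vs-control trade-off of gradient guidance.
print(x)
```

The interleaving, denoise then take a gradient step on the intermediate latent, is what the continuous, hierarchical latents make possible; with discrete tokens there would be no gradient to follow.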
