Paper Title
Learning Sparse Prototypes for Text Generation
Paper Authors
Paper Abstract
Prototype-driven text generation uses non-parametric models that first choose from a library of sentence "prototypes" and then modify the selected prototype to generate the output text. While effective, these methods are inefficient at test time because they must store and index the entire training corpus. Further, existing methods often require heuristics to identify which prototypes to reference at training time. In this paper, we propose a novel generative model that automatically learns a sparse prototype support set while still achieving strong language modeling performance. This is achieved by (1) imposing a sparsity-inducing prior on the prototype selection distribution, and (2) utilizing amortized variational inference to learn a prototype retrieval function. In experiments, our model outperforms previous prototype-driven language models while achieving up to a 1000x memory reduction, as well as a 1000x speed-up at test time. More interestingly, we show that the learned prototypes capture semantics and syntax at different granularities as we vary the sparsity of prototype selection, and that certain sentence attributes can be controlled by specifying the prototype used for generation.
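For concreteness, below is a minimal PyTorch sketch of the two ingredients the abstract names: a sparsity-inducing prior over the prototype selection distribution and an amortized retrieval network q(z | x) trained by variational inference. Everything here is an illustrative assumption rather than the paper's actual implementation: the class and parameter names, the sizes K and D, and in particular the choice of a Dirichlet hyperprior with concentration ALPHA < 1 as the sparsity-inducing prior.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes; the real prototype library would start as large as the
# training corpus and shrink as the sparse prior prunes its support.
K, D = 512, 256   # number of candidate prototypes, sentence-encoding size
ALPHA = 0.1       # Dirichlet concentration < 1 => sparsity-inducing hyperprior
EPS = 1e-8

class SparsePrototypeLM(nn.Module):
    """Sketch (names are hypothetical) of: a sparsity-inducing prior over
    prototype selection, plus an amortized retrieval network q(z | x)."""

    def __init__(self):
        super().__init__()
        self.pi_logits = nn.Parameter(torch.zeros(K))  # prior p(z) = Cat(softmax(pi_logits))
        self.retriever = nn.Linear(D, K)               # amortized q(z | x) from a sentence encoding
        self.prototypes = nn.Embedding(K, D)           # prototype vectors fed to the (omitted) editor

    def loss(self, x_enc, recon_log_prob, num_train):
        # x_enc: (B, D) sentence encodings; recon_log_prob: (B, K) with
        # entry [i, k] = log p(x_i | z = k) from the prototype-editing decoder.
        q = F.softmax(self.retriever(x_enc), dim=-1)   # q(z | x), shape (B, K)
        pi = F.softmax(self.pi_logits, dim=-1)         # current prior over prototypes

        recon = (q * recon_log_prob).sum(-1)           # E_q[log p(x | z)]
        kl = (q * (torch.log(q + EPS) - torch.log(pi + EPS))).sum(-1)  # KL(q || p)

        # Dirichlet(ALPHA) log-density over pi, up to a constant; with
        # ALPHA < 1 this term grows as probability mass concentrates on a
        # few prototypes, driving most selection probabilities toward zero.
        log_hyperprior = ((ALPHA - 1.0) * torch.log(pi + EPS)).sum()

        elbo = recon - kl
        return -(elbo.mean() + log_hyperprior / num_train)  # negative objective to minimize
```

In a sketch like this, the memory and speed savings described above come from discarding, at test time, every prototype whose mass under pi is negligible; at corpus scale one would also replace the exact expectation over all K prototypes with samples drawn from the retrieval distribution q.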