Paper Title

Variational Template Machine for Data-to-Text Generation

Paper Authors

Rong Ye, Wenxian Shi, Hao Zhou, Zhongyu Wei, Lei Li

Paper Abstract

How can we generate descriptions from structured data organized in tables? Existing approaches using neural encoder-decoder models often suffer from a lack of diversity. We claim that an open set of templates is crucial for enriching phrase constructions and realizing varied generations. Learning such templates is prohibitive since it often requires a large paired <table, description> corpus, which is seldom available. This paper explores the problem of automatically learning reusable "templates" from paired and non-paired data. We propose the Variational Template Machine (VTM), a novel method to generate text descriptions from data tables. Our contributions include: a) we carefully devise a specific model architecture and losses to explicitly disentangle text template and semantic content information in the latent space, and b) we utilize both small parallel data and large raw text without aligned tables to enrich template learning. Experiments on datasets from a variety of different domains show that VTM is able to generate more diverse outputs while maintaining good fluency and quality.
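
The sketch below illustrates the disentangling idea from the abstract: a table encoder produces a content representation, a separate template latent variable is inferred from text (or sampled from a Gaussian prior at generation time), and a decoder conditions on both. This is a minimal, hypothetical sketch, not the authors' implementation; all module names and dimensions are assumptions, and the paper's additional auxiliary losses for keeping content out of the template latent are omitted.

```python
# Minimal sketch (assumed names/dims) of a VAE-style model with a table-derived
# content vector and a separate template latent z, as described in the abstract.
import torch
import torch.nn as nn


class VTMSketch(nn.Module):
    def __init__(self, vocab_size, field_vocab_size, d_model=256, d_latent=64):
        super().__init__()
        # Content side: embed (field, value) pairs from the table and pool them.
        self.field_emb = nn.Embedding(field_vocab_size, d_model)
        self.value_emb = nn.Embedding(vocab_size, d_model)
        self.content_proj = nn.Linear(2 * d_model, d_model)
        # Template side: amortized posterior q(z | y) over a sentence y,
        # usable for both paired descriptions and raw text without tables.
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.text_encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.to_mu = nn.Linear(d_model, d_latent)
        self.to_logvar = nn.Linear(d_model, d_latent)
        # Decoder conditions on both the content vector and the template latent.
        self.init_state = nn.Linear(d_model + d_latent, d_model)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def encode_table(self, fields, values):
        # fields, values: (batch, n_cells) integer ids; mean-pool cell embeddings.
        cells = torch.cat([self.field_emb(fields), self.value_emb(values)], dim=-1)
        return self.content_proj(cells).mean(dim=1)            # (batch, d_model)

    def encode_template(self, tokens):
        # tokens: (batch, seq_len); last hidden state parameterizes q(z | y).
        _, h = self.text_encoder(self.token_emb(tokens))
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logvar(h)

    def decode(self, content, z, tokens):
        # Teacher-forced decoding conditioned on [content; z].
        h0 = torch.tanh(self.init_state(torch.cat([content, z], dim=-1))).unsqueeze(0)
        out, _ = self.decoder(self.token_emb(tokens), h0)
        return self.out(out)                                   # (batch, seq, vocab)


def elbo_loss(logits, targets, mu, logvar, pad_id=0):
    # Reconstruction + KL(q(z|y) || N(0, I)); the paper's extra auxiliary losses
    # that further disentangle template from content are not included here.
    rec = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1), ignore_index=pad_id)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
    return rec + kl
```

At generation time, the same table content can be decoded with different template latents z sampled from the prior, which is how a model of this form would realize the varied generations the abstract refers to.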
