Paper Title
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
Paper Authors
Paper Abstract
Data-to-text generation has recently attracted substantial interest due to its wide range of applications. Existing methods have shown impressive performance on an array of tasks. However, they rely on a significant amount of labeled data for each task, which is costly to acquire and thus limits their application to new tasks and domains. In this paper, we propose to leverage pre-training and transfer learning to address this issue. We propose knowledge-grounded pre-training (KGPT), which consists of two parts: 1) a general knowledge-grounded generation model to generate knowledge-enriched text, and 2) a pre-training paradigm on a massive knowledge-grounded text corpus crawled from the web. The pre-trained model can be fine-tuned on various data-to-text generation tasks to generate task-specific text. We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness. Under the fully-supervised setting, our model achieves remarkable gains over the known baselines. Under the zero-shot setting, our model, without seeing any examples, achieves over 30 ROUGE-L on WebNLG, while all other baselines fail. Under the few-shot setting, our model only needs about one-fifteenth as many labeled examples to achieve the same level of performance as baseline models. These experiments consistently prove the strong generalization ability of our proposed framework. Code is available at https://github.com/wenhuchen/KGPT.
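As a rough illustration of the workflow the abstract describes (linearize structured knowledge, feed it to a pre-trained generator, fine-tune on task-specific pairs such as WebNLG), here is a minimal sketch. It uses an off-the-shelf BART checkpoint from HuggingFace and invented <S>/<P>/<O> markers purely for illustration; it is not the paper's KGPT architecture, tokenization scheme, or pre-training corpus.

```python
# Illustrative sketch only: NOT the paper's KGPT model.
# Shows the generic "linearize knowledge -> pre-trained seq2seq -> generate text"
# paradigm that KGPT-style data-to-text fine-tuning builds on.
from transformers import BartForConditionalGeneration, BartTokenizer

def linearize_triples(triples):
    """Flatten (subject, predicate, object) triples into one input string.
    The <S>/<P>/<O> markers are an assumed convention, not the paper's format."""
    return " ".join(f"<S> {s} <P> {p} <O> {o}" for s, p, o in triples)

# Off-the-shelf checkpoint used as a stand-in for a knowledge-grounded pre-trained model.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Example WebNLG-style input: a tiny knowledge graph about one entity.
triples = [("Alan_Bean", "occupation", "Test pilot"),
           ("Alan_Bean", "birthPlace", "Wheeler, Texas")]
inputs = tokenizer(linearize_triples(triples), return_tensors="pt")

# After fine-tuning on (knowledge, text) pairs, generation yields a textual description.
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In the zero-shot and few-shot settings reported in the abstract, the same generation step would be run with no (or only a handful of) task-specific fine-tuning examples, relying on the knowledge-grounded pre-training to generalize.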