Paper Title
A Survey of Knowledge-Enhanced Text Generation
Authors
Abstract
The goal of text generation is to enable machines to express themselves in human language. It is one of the most important yet challenging tasks in natural language processing (NLP). Since 2014, various neural encoder-decoder models, pioneered by Seq2Seq, have been proposed to achieve this goal by learning to map input text to output text. However, the input text alone often provides too little knowledge to generate the desired output, so the performance of text generation is still far from satisfactory in many real-world scenarios. To address this issue, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models. This research direction is known as knowledge-enhanced text generation. In this survey, we present a comprehensive review of the research on knowledge-enhanced text generation over the past five years. The main content includes two parts: (i) general methods and architectures for integrating knowledge into text generation; (ii) specific techniques and applications according to different forms of knowledge data. This survey is intended for a broad audience of researchers and practitioners in academia and industry.