Paper Title

Knowledge Injection into Dialogue Generation via Language Models

Authors

Yi-Lin Tuan, Wei Wei, William Yang Wang

Abstract

Dialogue generation has been successfully learned from scratch by neural networks, but tends to produce the same generic response, e.g., "what are you talking about?", in many conversations. To reduce this homogeneity, external knowledge such as the speaker's profile and domain knowledge is applied as an additional condition to diversify a model's output. The knowledge required to conduct an effective conversation, however, is not always available, in contrast to the assumption in prior work that a model has always acquired sufficient knowledge before chatting. This problem can be detrimental when such a dialogue model is applied to chat online with unconstrained people and topics, because the model lacks the needed knowledge. To address this problem, we propose InjK, a two-stage approach that injects knowledge into a dialogue generation model. First, we train a large-scale language model and query it for textual knowledge. Second, we frame the dialogue generation model to sequentially generate textual knowledge and a corresponding response. Empirically, when the dialogue generation model can only access limited knowledge, our method outperforms prior work by producing more coherent and informative responses.
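
To make the second stage described in the abstract concrete, below is a minimal sketch, not the authors' released code: the dialogue model decodes textual knowledge first and the response second in a single left-to-right pass, so the response is conditioned on the knowledge it just generated. The GPT-2 backbone, the `[knowledge]` / `[response]` separator strings, and the decoding settings are illustrative assumptions.

```python
# Minimal sketch of sequential knowledge-then-response generation.
# Assumptions (not from the paper): GPT-2 backbone, plain-text
# "[knowledge]" / "[response]" separators, nucleus-sampling settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def knowledge_then_response(history: str) -> str:
    """Decode textual knowledge and the response in one pass, so the
    response tokens attend to the freshly generated knowledge tokens."""
    prompt = f"{history} [knowledge]"  # hypothetical separator marking stage-2 decoding
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=80,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Expected shape of the decoded string:
    #   "<history> [knowledge] <generated knowledge> [response] <generated response>"
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(knowledge_then_response("A: Have you watched any good movies lately?"))
```

Generating knowledge and response as one sequence lets the same decoder condition the response on knowledge it produced itself, rather than requiring sufficient external knowledge to be supplied before the conversation starts.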
