Paper Title

Meta-Context Transformers for Domain-Specific Response Generation

Authors

Debanjana Kar, Suranjana Samanta, Amar Prakash Azad

Abstract

Despite the tremendous success of neural dialogue models in recent years, they suffer from a lack of relevance, diversity, and sometimes coherence in their generated responses. Lately, transformer-based models such as GPT-2 have revolutionized the landscape of dialogue generation by capturing long-range structure through language modeling. Though these models exhibit excellent language coherence, they often lack relevance and domain-specific terminology when used for domain-specific response generation. In this paper, we present DSRNet (Domain Specific Response Network), a transformer-based model for dialogue response generation that reinforces domain-specific attributes. In particular, we extract meta attributes from the context and infuse them with the context utterances for better attention over domain-specific key terms and relevance. We study the use of DSRNet in a multi-turn, multi-interlocutor environment for domain-specific response generation. In our experiments, we evaluate DSRNet on the Ubuntu Dialogue datasets, which are mainly composed of technical dialogues about IT-domain issue resolution, and on the CamRest676 dataset, which contains restaurant-domain conversations. Trained with a maximum likelihood objective, our model shows significant improvement over the state of the art for multi-turn dialogue systems, supported by better BLEU and semantic-similarity (BERTScore) scores. Besides, we also observe that the responses produced by our model carry higher relevance due to the presence of domain-specific key attributes that exhibit better overlap with the attributes of the context. Our analysis shows that the performance improvement is mostly due to the infusion of key terms along with the dialogues, which results in better attention over domain-relevant terms. Other contributing factors include joint modeling of the dialogue context with the domain-specific meta attributes and topics.
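The meta-attribute infusion described in the abstract can be sketched roughly as follows. This is a minimal illustration only: the frequency-based key-term heuristic, the special tokens (`<|meta|>`, `<|sep|>`), and the function names are assumptions for the sake of the example, not DSRNet's actual extraction or input format, which are detailed in the paper.

```python
# Hypothetical sketch: prepend extracted "meta attributes" (domain key terms)
# to the dialogue context before feeding it to a GPT-2-style language model.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "to", "i", "it", "is", "and", "you",
             "my", "on", "in", "if", "did", "with", "another"}

def extract_meta_attributes(context_utterances, top_k=5):
    """Toy key-term extractor: most frequent non-stopword tokens."""
    tokens = []
    for utt in context_utterances:
        tokens += [t for t in re.findall(r"[a-z0-9\-\.]+", utt.lower())
                   if t not in STOPWORDS]
    return [w for w, _ in Counter(tokens).most_common(top_k)]

def build_model_input(context_utterances, sep="<|sep|>", meta_tag="<|meta|>"):
    """Infuse the extracted key terms with the context utterances."""
    meta = " ".join(extract_meta_attributes(context_utterances))
    context = f" {sep} ".join(context_utterances)
    return f"{meta_tag} {meta} {sep} {context} {sep}"

ctx = ["my apt-get upgrade fails with a dpkg lock error",
       "did you check if another apt-get process is running?"]
print(build_model_input(ctx))
```

The intuition is that placing domain key terms directly in the model input lets self-attention attend to them when generating the response, which is what the abstract credits for the improved relevance.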
