MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation

Paper Title

MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation

Authors

Wenge Liu, Jianheng Tang, Yi Cheng, Wenjie Li, Yefeng Zheng, Xiaodan Liang

Abstract

Developing conversational agents to interact with patients and provide primary clinical advice has attracted increasing attention due to its huge application potential, especially during the COVID-19 pandemic. However, the training of end-to-end neural medical dialogue systems is restricted by the insufficient quantity of medical dialogue corpora. In this work, we make the first attempt to build and release a large-scale, high-quality medical dialogue dataset related to 12 types of common gastrointestinal diseases, named MedDG, with more than 17K conversations collected from an online health consultation community. Five different categories of entities, including diseases, symptoms, attributes, tests, and medicines, are annotated in each conversation of MedDG as additional labels. To push forward future research on building expert-sensitive medical dialogue systems, we propose two medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation. To gain a clear understanding of these two tasks, we implement several state-of-the-art benchmarks and design two dialogue models that further take the predicted entities into account. Experimental results show that pre-trained language models and other baselines struggle on both tasks, performing poorly on our dataset, and that response quality can be enhanced with the help of auxiliary entity information. In human evaluation, the simple retrieval model outperforms several state-of-the-art generative models, indicating that there remains considerable room for improvement in generating medically meaningful responses.
