Paper Title
MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation
Authors
Abstract
Responding with multi-modal content has been recognized as an essential capability for an intelligent conversational agent. In this paper, we introduce the MMDialog dataset to better facilitate multi-modal conversation. MMDialog is composed of a curated set of 1.08 million real-world dialogues with 1.53 million unique images across 4,184 topics. MMDialog has two main and unique advantages. First, it is the largest multi-modal conversation dataset by number of dialogues, by a factor of 88. Second, it covers a large number of topics, which supports generalization to the open domain. To build engaging dialogue systems with this dataset, we propose and normalize two response-producing tasks based on retrieval and generative scenarios. In addition, we build two baselines for these tasks with state-of-the-art techniques and report their experimental performance. We also propose a novel evaluation metric, MM-Relevance, to measure multi-modal responses. Our dataset and scripts are available at https://github.com/victorsungo/MMDialog.