Paper Title


MM-DFN: Multimodal Dynamic Fusion Network for Emotion Recognition in Conversations

Authors

Dou Hu, Xiaolong Hou, Lingwei Wei, Lianxin Jiang, Yang Mo

Abstract

Emotion Recognition in Conversations (ERC) has considerable prospects for developing empathetic machines. For multimodal ERC, it is vital to understand context and fuse modality information in conversations. Recent graph-based fusion methods generally aggregate multimodal information by exploring unimodal and cross-modal interactions in a graph. However, they accumulate redundant information at each layer, limiting the context understanding between modalities. In this paper, we propose a novel Multimodal Dynamic Fusion Network (MM-DFN) to recognize emotions by fully understanding multimodal conversational context. Specifically, we design a new graph-based dynamic fusion module to fuse multimodal contextual features in a conversation. The module reduces redundancy and enhances complementarity between modalities by capturing the dynamics of contextual information in different semantic spaces. Extensive experiments on two public benchmark datasets demonstrate the effectiveness and superiority of MM-DFN.
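To make the graph-based fusion idea concrete, here is a minimal sketch of one aggregation step over a multimodal conversation graph. It is an illustrative simplification, not the paper's actual dynamic fusion module: nodes are (utterance, modality) pairs, edges link the modalities of the same utterance (cross-modal) and the same modality across turns (unimodal), and fusion is plain GCN-style mean aggregation over neighbors.

```python
import numpy as np

def fuse_multimodal_graph(features, adj):
    """One mean-aggregation step over multimodal utterance nodes.

    features: (N, d) array, one row per (utterance, modality) node.
    adj: (N, N) binary adjacency with unimodal edges (same modality,
         adjacent turns) and cross-modal edges (same utterance).
    Returns each node's feature averaged with its neighbors
    (a plain GCN-style step; MM-DFN's dynamic module is richer).
    """
    adj = adj + np.eye(adj.shape[0])       # add self-loops
    deg = adj.sum(axis=1, keepdims=True)   # node degrees
    return (adj @ features) / deg          # mean over self + neighbors

# Toy example: 2 utterances x 3 modalities (text, audio, video), d = 4.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))            # node order: u0(t,a,v), u1(t,a,v)
A = np.zeros((6, 6))
for u in range(2):                         # cross-modal edges within a turn
    for i in range(3):
        for j in range(3):
            if i != j:
                A[3 * u + i, 3 * u + j] = 1
for m in range(3):                         # unimodal edges across the turns
    A[m, m + 3] = A[m + 3, m] = 1
fused = fuse_multimodal_graph(feats, A)
print(fused.shape)  # (6, 4)
```

Stacking several such layers is what lets redundant information accumulate; the paper's contribution is a dynamic module that controls this accumulation across semantic spaces.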
