Paper Title

GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition

Paper Authors

Ye, Jia-Xin, Wen, Xin-Cheng, Wang, Xuan-Ze, Xu, Yong, Luo, Yan, Wu, Chang-Li, Chen, Li-Yan, Liu, Kun-Hong

Paper Abstract

In human-computer interaction, Speech Emotion Recognition (SER) plays an essential role in understanding the user's intent and improving the interactive experience. While similar emotional speech exhibits diverse speaker characteristics yet shares common antecedents and consequences, an essential challenge for SER is how to produce robust and discriminative representations through the causality between speech emotions. In this paper, we propose a Gated Multi-scale Temporal Convolutional Network (GM-TCNet) to construct a novel emotional causality representation learning component with a multi-scale receptive field. GM-TCNet deploys this component, built from dilated causal convolution layers and a gating mechanism, to capture the dynamics of emotion across the time domain. In addition, it utilizes skip connections that fuse high-level features from different gated convolution blocks to capture abundant and subtle emotion changes in human speech. GM-TCNet first uses a single type of feature, mel-frequency cepstral coefficients (MFCCs), as input, then passes it through the gated temporal convolutional module to generate high-level features. Finally, these features are fed to the emotion classifier to accomplish the SER task. The experimental results show that our model achieves the highest performance in most cases compared to state-of-the-art techniques.
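To make the three mechanisms named in the abstract concrete (dilated causal convolution, gating, and skip connections across blocks), here is a minimal illustrative sketch in plain Python. This is not the authors' implementation: GM-TCNet operates on learned multi-channel MFCC features, whereas this toy version uses a single 1-D sequence, hand-set weights, and a WaveNet-style tanh/sigmoid gate as an assumed form of the gating mechanism.

```python
import math

def dilated_causal_conv1d(x, weights, dilation):
    """Causal 1-D convolution: output[t] depends only on
    x[t], x[t - dilation], x[t - 2*dilation], ... (never on future samples)."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation
            if idx >= 0:  # taps reaching before the sequence start are dropped
                acc += w * x[idx]
        out.append(acc)
    return out

def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_block(x, w_filter, w_gate, dilation):
    """Gating mechanism (assumed WaveNet-style form):
    tanh(filter branch) * sigmoid(gate branch)."""
    f = dilated_causal_conv1d(x, w_filter, dilation)
    g = dilated_causal_conv1d(x, w_gate, dilation)
    return [math.tanh(a) * _sigmoid(b) for a, b in zip(f, g)]

def multi_scale_skip_sum(x, w_filter, w_gate, dilations=(1, 2, 4)):
    """Skip connections: sum the outputs of gated blocks with different
    dilations, so the fused representation mixes several receptive-field
    sizes (the 'multi-scale' aspect)."""
    skip = [0.0] * len(x)
    for d in dilations:
        block_out = gated_block(x, w_filter, w_gate, d)
        skip = [s + b for s, b in zip(skip, block_out)]
    return skip
```

The causal constraint is what lets the stacked blocks model how earlier emotional cues condition later ones: perturbing a future frame can never change the representation of a past frame, while larger dilations widen the temporal context each block sees.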
