Multi-modal Emotion Estimation for in-the-wild Videos

Paper Title

Multi-modal Emotion Estimation for in-the-wild Videos

Authors

Meng, Liyu; Liu, Yuchen; Liu, Xiaolong; Huang, Zhaopei; Cheng, Yuan; Wang, Meng; Liu, Chuanhe; Jin, Qin

Abstract

In this paper, we briefly introduce our submission to the Valence-Arousal Estimation Challenge of the 3rd Affective Behavior Analysis in-the-wild (ABAW) competition. Our method uses multi-modal information, i.e., visual and audio information, and employs a temporal encoder to model the temporal context in the videos. In addition, a smoothing processor is applied to obtain more reasonable predictions, and a model ensemble strategy is used to improve the performance of the proposed method. The experimental results show that our method achieves a concordance correlation coefficient (CCC) of 65.55% for valence and 70.88% for arousal on the validation set of the Aff-Wild2 dataset, which demonstrates the effectiveness of the proposed method.
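
The CCC figures quoted in the abstract refer to the concordance correlation coefficient, which rewards predictions that both correlate with the labels and match them in mean and scale. The following is a minimal sketch of the standard CCC formula, not the authors' evaluation code; the function name `ccc` and the sample arrays are hypothetical and chosen only for illustration.

```python
import numpy as np


def ccc(pred, gold):
    """Concordance correlation coefficient between two 1-D sequences.

    Illustrative helper (name and signature are assumptions, not taken
    from the paper). Uses population (biased) variance and covariance.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mean_p, mean_g = pred.mean(), gold.mean()
    var_p, var_g = pred.var(), gold.var()
    cov = ((pred - mean_p) * (gold - mean_g)).mean()
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    # Assumption: return 0.0 when both sequences are constant and equal,
    # where the ratio is otherwise undefined.
    return 0.0 if denom == 0 else 2.0 * cov / denom


# Made-up frame-level valence values, for illustration only.
labels = np.array([-1.0, 0.0, 1.0])
print(ccc(labels, labels))        # identical sequences
print(ccc(labels + 1.0, labels))  # same shape, shifted mean
```

Unlike Pearson correlation, a constant offset between predictions and labels lowers the score, which is why the second call scores below the first.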
