Paper Title
AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies
Paper Authors
Abstract
In this work, we propose different variants of self-attention based networks for emotion prediction from movies, which we call AttendAffectNet. We take both audio and video into account and incorporate the relations among multiple modalities by applying the self-attention mechanism in a novel manner to the extracted features for emotion prediction. We compare this to the typical temporal integration of self-attention based models, which, in our case, captures the relations among temporal representations of a movie while accounting for the sequential dependencies of emotion responses. We demonstrate the effectiveness of our proposed architectures on the extended COGNIMUSE dataset [1], [2] and the MediaEval 2016 Emotional Impact of Movies Task [3], both of which consist of movies with emotion annotations. Our results show that applying the self-attention mechanism across the different audio-visual features, rather than in the time domain, is more effective for emotion prediction. Our approach is also shown to outperform many state-of-the-art models for emotion prediction. The code to reproduce our results, including the model implementations, is available at: https://github.com/ivyha010/AttendAffectNet.
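To make the core idea concrete, here is a minimal sketch (not the authors' released code) of applying self-attention across audio-visual feature tokens rather than across time steps: each extracted modality feature is treated as one token, and the attention layers model relations among features. All module names, dimensions, and the mean-pooling readout are illustrative assumptions; see the linked repository for the actual implementation.

```python
# Hypothetical sketch of feature-wise self-attention for affect prediction.
# Assumes each modality feature (audio, visual, ...) is already projected
# to a common dimension; dimensions here are arbitrary choices.
import torch
import torch.nn as nn

class FeatureSelfAttention(nn.Module):
    """Treats each extracted feature vector as a token and lets
    self-attention model the relations among modalities."""
    def __init__(self, feature_dim: int = 512, num_heads: int = 4):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feature_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(feature_dim, 2)  # e.g., valence and arousal

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_feature_tokens, feature_dim), one token per
        # modality-specific feature rather than per time step.
        attended = self.encoder(feats)   # relate features to each other
        pooled = attended.mean(dim=1)    # aggregate over feature tokens
        return self.head(pooled)         # predict the affective response

# Example: 5 feature tokens (several visual + audio descriptors)
model = FeatureSelfAttention()
dummy = torch.randn(8, 5, 512)
print(model(dummy).shape)  # torch.Size([8, 2])
```

Swapping the token axis from features to time steps would recover the temporal-integration baseline that the abstract compares against.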