Paper title
SS-VAERR: Self-Supervised Apparent Emotional Reaction Recognition from Video
Paper authors
Abstract
This work focuses on apparent emotional reaction recognition (AERR) from video-only input, conducted in a self-supervised fashion. The network is first pre-trained on different self-supervised pretext tasks and then fine-tuned on the downstream target task. Self-supervised learning facilitates the use of pre-trained architectures and of larger datasets that might be deemed unfit for the target task yet are still useful for learning informative representations, and hence provide useful initializations for further fine-tuning on smaller, more suitable data. Our contribution is two-fold: (1) an analysis of different state-of-the-art (SOTA) pretext tasks for the video-only apparent emotional reaction recognition architecture, and (2) an analysis of various combinations of regression and classification losses that are likely to improve performance further. Together, these two contributions result in the current state-of-the-art performance for video-only spontaneous apparent emotional reaction recognition with continuous annotations.
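
To make the second contribution more concrete, the sketch below shows one possible way to combine a regression loss with a classification loss on continuous annotations. The abstract does not say which losses are combined or how they are weighted, so everything here is an assumption for illustration: the choice of mean squared error and cross-entropy, the binning of annotations into classes over an assumed range of [-1, 1], the weight `alpha`, and all function names.

```python
import math


def mse_loss(preds, targets):
    """Mean squared error between continuous predictions and annotations."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def to_bin(value, num_bins, low=-1.0, high=1.0):
    """Map a continuous annotation to a class index (assumed uniform binning)."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * num_bins)
    return min(idx, num_bins - 1)  # the upper edge falls into the last bin


def cross_entropy(logits, target_idx):
    """Cross-entropy for one frame, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_idx]


def combined_loss(preds, logits_per_frame, targets, alpha=0.5):
    """Weighted sum of a regression term and a classification term.

    alpha is a hypothetical mixing weight: 1.0 uses only regression,
    0.0 uses only classification on the binned annotations.
    """
    num_bins = len(logits_per_frame[0])
    reg = mse_loss(preds, targets)
    cls = sum(
        cross_entropy(logits, to_bin(t, num_bins))
        for logits, t in zip(logits_per_frame, targets)
    ) / len(targets)
    return alpha * reg + (1.0 - alpha) * cls
```

In such a setup the model would need two output heads, one producing a continuous value per frame and one producing logits over the bins. Whether the paper uses this arrangement is not stated in the abstract.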
