Paper Title
Facial Action Unit Detection via Adaptive Attention and Relation
Paper Authors
Abstract
Facial action unit (AU) detection is challenging due to the difficulty of capturing correlated information from subtle and dynamic AUs. Existing methods often resort to localizing the correlated regions of AUs, in which predefining local AU attentions from correlated facial landmarks often discards essential parts, while learned global attention maps often include irrelevant areas. Furthermore, existing relational reasoning methods often apply a common pattern to all AUs while ignoring the specific way each AU behaves. To tackle these limitations, we propose a novel adaptive attention and relation (AAR) framework for facial AU detection. Specifically, we propose an adaptive attention regression network that regresses the global attention map of each AU under the constraint of attention predefinition and the guidance of AU detection, which is beneficial for capturing both landmark-specified dependencies in strongly correlated regions and globally distributed facial dependencies in weakly correlated regions. Moreover, considering the diversity and dynamics of AUs, we propose an adaptive spatio-temporal graph convolutional network to simultaneously reason about the independent pattern of each AU, the inter-dependencies among AUs, and the temporal dependencies. Extensive experiments show that our approach (i) achieves competitive performance on challenging benchmarks, including BP4D, DISFA, and GFT in constrained scenarios and Aff-Wild2 in unconstrained scenarios, and (ii) can precisely learn the regional correlation distribution of each AU.
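The abstract describes two computational ideas: per-AU attention maps that pool features from correlated facial regions, and graph-based relational reasoning over the AU nodes. The paper itself does not give implementation details here, so the following is only a minimal NumPy sketch of those two generic operations (attention-weighted pooling, then one graph-convolution step with a learned adjacency); all shapes, variable names, and the random parameters are illustrative assumptions, not the authors' AAR implementation.

```python
import numpy as np

# Hypothetical sizes: C feature channels on an H x W map, N AUs.
C, H, W, N = 16, 8, 8, 12
rng = np.random.default_rng(0)

feat = rng.standard_normal((C, H, W))          # backbone feature map (assumed)
attn = rng.random((N, H, W))                   # one attention map per AU (stand-in for regressed maps)
attn /= attn.sum(axis=(1, 2), keepdims=True)   # normalize each map to a spatial distribution

# (i) Attention-weighted pooling: one C-dim feature vector per AU,
# gathering evidence from that AU's correlated regions.
au_feat = np.einsum('nhw,chw->nc', attn, feat)  # shape (N, C)

# (ii) One graph-convolution step over AU nodes, X' = relu(A X W),
# modeling inter-dependencies among AUs via an adjacency A
# (random here; learned/adaptive in relation-reasoning methods).
A = rng.random((N, N))
A /= A.sum(axis=1, keepdims=True)              # row-normalize the adjacency
Wg = rng.standard_normal((C, C)) / np.sqrt(C)  # node transform weights
au_feat_rel = np.maximum(A @ au_feat @ Wg, 0)  # relation-refined AU features, (N, C)

# Per-AU occurrence probability via a shared linear head + sigmoid.
w_out = rng.standard_normal(C) / np.sqrt(C)
scores = 1.0 / (1.0 + np.exp(-(au_feat_rel @ w_out)))  # shape (N,)
```

A temporal extension, as the abstract suggests, would additionally convolve `au_feat_rel` across frames; that step is omitted here for brevity.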