Vision Transformer Equipped with Neural Resizer on Facial Expression Recognition Task

Paper Title

Vision Transformer Equipped with Neural Resizer on Facial Expression Recognition Task

Authors

Hwang, Hyeonbin; Kim, Soyeon; Park, Wei-Jin; Seo, Jiho; Ko, Kyungtae; Yeo, Hyeon

Abstract

In the wild, Facial Expression Recognition is often challenged by low-quality data and imbalanced, ambiguous labels. The field has benefited greatly from CNN-based approaches; however, CNN models are structurally limited in their ability to relate distant facial regions. As a remedy, Transformers, with their global receptive field, have been introduced to vision, but they require the input spatial size to be adjusted to match the pretrained models in order to benefit from their strong inductive bias. We raise the question of whether a deterministic interpolation method is sufficient for feeding low-resolution data to a Transformer. In this work, we propose a novel training framework, Neural Resizer, which supports the Transformer by compensating information and downscaling in a data-driven manner, and which is trained with a loss function that balances noisiness and imbalance. Experiments show that our Neural Resizer with the F-PDLS loss function generally improves performance across Transformer variants and nearly achieves state-of-the-art performance.
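To make the "deterministic interpolation" baseline concrete, here is a minimal sketch of bilinear resizing in plain Python. It is not the paper's Neural Resizer, whose architecture the abstract does not describe; it only illustrates the fixed, non-learned mapping that the abstract questions. The function name `bilinear_resize`, the align-corners convention, and the 48x48 and 224x224 sizes are assumptions chosen for illustration.

```python
def bilinear_resize(image, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) with bilinear interpolation.

    Assumes the align-corners convention: input and output corner pixels
    coincide. The mapping is fixed; nothing here is learned from data.
    """
    in_h, in_w = len(image), len(image[0])
    # Scale factors mapping output indices to input coordinates.
    sy = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    sx = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for i in range(out_h):
        y = i * sy
        y0 = int(y)                    # row above the sample point
        y1 = min(y0 + 1, in_h - 1)     # row below, clamped to the border
        wy = y - y0                    # vertical blend weight
        row = []
        for j in range(out_w):
            x = j * sx
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


# Hypothetical sizes: a 48x48 face crop upsampled to a 224x224 model input.
low_res = [[float(r + c) for c in range(48)] for r in range(48)]
vit_input = bilinear_resize(low_res, 224, 224)
```

Every output pixel is a fixed weighted average of at most four input pixels, so the resize cannot add information that the low-resolution image lacks. A learned resizer replaces this fixed mapping with trainable parameters optimized together with the recognition loss.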
