Paper Title

RePFormer: Refinement Pyramid Transformer for Robust Facial Landmark Detection

Authors

Jinpeng Li, Haibo Jin, Shengcai Liao, Ling Shao, Pheng-Ann Heng

Abstract

This paper presents a Refinement Pyramid Transformer (RePFormer) for robust facial landmark detection. Most facial landmark detectors focus on learning representative image features. However, these CNN-based feature representations are not robust enough to handle complex real-world scenarios because they ignore the internal structure of landmarks as well as the relations between landmarks and context. In this work, we formulate the facial landmark detection task as refining landmark queries along pyramid memories. Specifically, a pyramid transformer head (PTH) is introduced to build both homologous relations among landmarks and heterologous relations between landmarks and cross-scale contexts. In addition, a dynamic landmark refinement (DLR) module is designed to decompose landmark regression into an end-to-end refinement procedure, in which the dynamically aggregated queries are transformed into residual coordinate predictions. Extensive experimental results on four facial landmark detection benchmarks and their various subsets demonstrate the superior performance and high robustness of our framework.
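
The refinement idea in the abstract, where each stage predicts residual coordinates that are added to the current landmark estimate, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `refine_landmarks` and `make_toy_predictor`, the stage count, and the fixed-fraction update are all assumptions made for illustration, and the toy predictor stands in for a learned transformer head that would attend to pyramid features.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
# Hypothetical signature: a stage maps (current landmarks, stage index)
# to one residual (dx, dy) per landmark.
ResidualFn = Callable[[List[Point], int], List[Point]]


def refine_landmarks(
    initial: List[Point],
    predict_residual: ResidualFn,
    num_stages: int,
) -> List[List[Point]]:
    """Apply residual updates stage by stage and return every intermediate estimate."""
    coords = list(initial)
    history = [coords]
    for stage in range(num_stages):
        residuals = predict_residual(coords, stage)
        # Each stage only predicts an offset; the estimate is the running sum.
        coords = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(coords, residuals)]
        history.append(coords)
    return history


def make_toy_predictor(target: List[Point], step: float = 0.5) -> ResidualFn:
    """Stand-in for a learned head (assumption): move a fixed fraction toward a known target."""
    def predict(coords: List[Point], stage: int) -> List[Point]:
        return [((tx - x) * step, (ty - y) * step)
                for (x, y), (tx, ty) in zip(coords, target)]
    return predict


if __name__ == "__main__":
    start = [(0.0, 0.0), (1.0, 1.0)]
    target = [(1.0, 0.0), (1.0, 2.0)]
    for i, estimate in enumerate(refine_landmarks(start, make_toy_predictor(target), 3)):
        print(i, estimate)
```

In a real model the residual at each stage would presumably come from queries aggregated over a different pyramid level rather than from a known target; the sketch only shows the additive coarse-to-fine structure of the update.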
