Paper Title
End-to-End Face Parsing via Interlinked Convolutional Neural Networks
Paper Authors
Abstract
Face parsing is an important computer vision task that requires accurate pixel-level segmentation of facial parts (eyes, nose, mouth, etc.), providing a basis for further face analysis, modification, and other applications. The Interlinked Convolutional Neural Network (iCNN) has proved to be an effective two-stage model for face parsing. However, the original iCNN was trained separately in two stages, limiting its performance. To solve this problem, we introduce a simple, end-to-end face parsing framework: the STN-aided iCNN (STN-iCNN), which extends the iCNN by adding a Spatial Transformer Network (STN) between the two isolated stages. The STN-iCNN uses the STN to provide a trainable connection for the original two-stage iCNN pipeline, making end-to-end joint training possible. Moreover, as a by-product, the STN also produces more precisely cropped parts than the original cropper. Due to these two advantages, our approach significantly improves the accuracy of the original model. Our model achieves competitive performance on the Helen dataset, the standard face parsing benchmark. It also achieves superior performance on the CelebAMask-HQ dataset, demonstrating good generalization. Our code has been released at https://github.com/aod321/STN-iCNN.
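The key mechanism the abstract describes is using an STN as a differentiable cropper between the two iCNN stages: the crop becomes an affine warp (grid generation plus bilinear sampling), so gradients can flow from the second stage back through the crop parameters. Below is a minimal NumPy sketch of that sampling operation, not the authors' implementation; the function names (`affine_grid`, `bilinear_sample`) and the single-channel setup are illustrative assumptions (in PyTorch the corresponding calls would be `torch.nn.functional.affine_grid` and `grid_sample`).

```python
import numpy as np

def affine_grid(theta, H, W):
    """Build an (H, W, 2) sampling grid from a 2x3 affine matrix `theta`,
    with coordinates normalized to [-1, 1] as in the usual STN setup.
    A scale-and-shift theta corresponds to cropping/zooming a region."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, H),
                         np.linspace(-1, 1, W), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    return coords @ theta.T  # (H, W, 2) source (x, y) per output pixel

def bilinear_sample(img, grid):
    """Sample a single-channel image at normalized grid locations with
    bilinear interpolation. Because this is piecewise-linear in the grid
    (and hence in theta), the crop is differentiable, which is what lets
    the STN connect the two stages for end-to-end training."""
    H, W = img.shape
    x = (grid[..., 0] + 1) * (W - 1) / 2  # back to pixel coordinates
    y = (grid[..., 1] + 1) * (H - 1) / 2
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    dx, dy = x - x0, y - y0
    return (img[y0, x0] * (1 - dx) * (1 - dy)
            + img[y0, x0 + 1] * dx * (1 - dy)
            + img[y0 + 1, x0] * (1 - dx) * dy
            + img[y0 + 1, x0 + 1] * dx * dy)

# An identity transform reproduces the input image exactly.
img = np.arange(16, dtype=float).reshape(4, 4)
theta_identity = np.array([[1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0]])
out = bilinear_sample(img, affine_grid(theta_identity, 4, 4))
```

In the full pipeline, a small localization network would predict one `theta` per facial part from the first-stage output, and the cropped patches would be fed to the second-stage part segmenters.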