Paper title
PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
Authors
Abstract
Local processing is an essential feature of CNNs and other neural network architectures; it is one of the reasons they work so well on images where relevant information is, to a large extent, local. However, perspective effects stemming from the projection in a conventional camera vary with the global position in the image. We introduce Perspective Crop Layers (PCLs), a form of perspective crop of the region of interest based on the camera geometry, and show that accounting for perspective consistently improves the accuracy of state-of-the-art 3D pose reconstruction methods. PCLs are modular neural network layers that, when inserted into existing CNN and MLP architectures, deterministically remove the location-dependent perspective effects while leaving end-to-end training and the number of parameters of the underlying neural network unchanged. We demonstrate that PCLs improve 3D human pose reconstruction accuracy for CNN architectures that use cropping operations, such as spatial transformer networks (STNs), and, somewhat surprisingly, for MLPs used for 2D-to-3D keypoint lifting. Our conclusion is that it is important to utilize camera calibration information when available, for classical and deep-learning-based computer vision alike. PCLs offer an easy way to improve the accuracy of existing 3D reconstruction networks by making them geometry-aware. Our code is publicly available at github.com/yu-frank/PerspectiveCropLayers.
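The geometric idea behind a perspective crop can be illustrated with a small NumPy sketch. It is not the released PCL code: it assumes a pinhole camera with known intrinsics `K`, and the function names, the choice of "up" axis, and the example intrinsics are all hypothetical. The sketch rotates a virtual camera so that its optical axis passes through the crop center, then builds the homography that maps original pixels into that virtual view.

```python
import numpy as np

def virtual_camera_rotation(K, center_px):
    """Rotation whose columns are the virtual camera axes in original camera coordinates.

    Hypothetical helper: the virtual z-axis points along the viewing ray
    through the crop center, so the crop is seen "head-on".
    """
    u, v = center_px
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # back-project the pixel to a ray
    z_axis = ray / np.linalg.norm(ray)
    # Assumption: keep the virtual x-axis perpendicular to the original y-axis.
    x_axis = np.cross(np.array([0.0, 1.0, 0.0]), z_axis)
    x_axis /= np.linalg.norm(x_axis)
    y_axis = np.cross(z_axis, x_axis)
    return np.stack([x_axis, y_axis, z_axis], axis=1)

def perspective_crop_homography(K, K_virt, center_px):
    """Homography taking original pixel coordinates to virtual-camera pixels."""
    R = virtual_camera_rotation(K, center_px)
    # Original pixel -> ray -> ray in virtual camera frame -> virtual pixel.
    return K_virt @ R.T @ np.linalg.inv(K)

# Illustrative intrinsics (made up for this example).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
K_virt = np.array([[1000.0, 0.0, 128.0],
                   [0.0, 1000.0, 128.0],
                   [0.0, 0.0, 1.0]])
center = (900.0, 200.0)

R = virtual_camera_rotation(K, center)
H = perspective_crop_homography(K, K_virt, center)

# The crop center lands on the virtual principal point (128, 128), up to rounding.
p = H @ np.array([center[0], center[1], 1.0])
p_virt = p[:2] / p[2]
```

Because the warp depends only on the calibration and the crop location, it adds no learnable parameters, which matches the abstract's description of a deterministic layer. A 3D pose predicted in the virtual camera frame would be rotated back with `R` to return to the original camera coordinates.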