Paper title
Pixel-Aligned Non-parametric Hand Mesh Reconstruction
Paper authors
Paper abstract
Non-parametric mesh reconstruction has recently shown significant progress in 3D hand and body applications. In these methods, mesh vertices and edges are visible to neural networks, making it possible to establish a direct mapping between 2D image pixels and 3D mesh vertices. In this paper, we seek to establish and exploit this mapping with a simple and compact architecture. The network is designed with two considerations: 1) aggregating both local 2D image features from the encoder and 3D geometric features captured in the mesh decoder; 2) decoding coarse-to-fine meshes along the decoding layers to make the best use of the hierarchical multi-scale information. Specifically, we propose an end-to-end pipeline for hand mesh recovery that consists of three phases: a 2D feature extractor that constructs multi-scale feature maps, a feature mapping module that transforms local 2D image features into 3D vertex features via 3D-to-2D projection, and a mesh decoder that combines graph convolution and self-attention to reconstruct the mesh. The decoder aggregates both local image features in pixels and geometric features in vertices. It also regresses the mesh vertices in a coarse-to-fine manner, which allows it to leverage multi-scale information. By exploiting these local connections and the design of the mesh decoder, our approach achieves state-of-the-art results for hand mesh reconstruction on the public FreiHAND dataset.
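The feature mapping step described in the abstract (3D-to-2D projection followed by gathering local image features per vertex) can be pictured with a minimal NumPy sketch. This is an illustration only, not the authors' implementation: the orthographic projection, the normalized [-1, 1] image coordinate convention, the bilinear sampling, and all function names (`project_vertices`, `sample_bilinear`, `pixel_aligned_features`) are assumptions made for the example.

```python
import numpy as np

def project_vertices(vertices, scale=1.0, trans=(0.0, 0.0)):
    """Assumed orthographic projection: drop z, then scale and shift.
    Returns (N, 2) points in normalized image coordinates [-1, 1]."""
    vertices = np.asarray(vertices, dtype=np.float64)
    return scale * vertices[:, :2] + np.asarray(trans, dtype=np.float64)

def sample_bilinear(feature_map, xy_norm):
    """Bilinearly sample a (C, H, W) feature map at (N, 2) normalized points.
    x = -1 maps to column 0 and x = +1 to column W-1 (likewise y and rows)."""
    _, h, w = feature_map.shape
    # Convert normalized coordinates to continuous pixel coordinates.
    x = np.clip((xy_norm[:, 0] + 1.0) * 0.5 * (w - 1), 0, w - 1)
    y = np.clip((xy_norm[:, 1] + 1.0) * 0.5 * (h - 1), 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    # Interpolate along x on the two neighboring rows, then along y.
    top = feature_map[:, y0, x0] * (1 - wx) + feature_map[:, y0, x1] * wx
    bottom = feature_map[:, y1, x0] * (1 - wx) + feature_map[:, y1, x1] * wx
    return (top * (1 - wy) + bottom * wy).T  # shape (N, C)

def pixel_aligned_features(vertices, feature_maps, scale=1.0, trans=(0.0, 0.0)):
    """Concatenate per-vertex features sampled from each scale's feature map."""
    xy = project_vertices(vertices, scale, trans)
    return np.concatenate([sample_bilinear(f, xy) for f in feature_maps], axis=1)

# Toy multi-scale maps: one encodes the column index, the other the row index.
fm_fine = np.tile(np.arange(4.0), (1, 4, 1))                     # (1, 4, 4)
fm_coarse = np.arange(2.0).reshape(1, 2, 1) * np.ones((1, 2, 2))  # (1, 2, 2)
verts = np.array([[0.0, 0.0, 0.3], [-1.0, -1.0, 0.0], [1.0, 1.0, 0.5]])
feats = pixel_aligned_features(verts, [fm_fine, fm_coarse])
```

In a full pipeline of the kind the abstract outlines, per-vertex features like `feats` would be combined with geometric features in the mesh decoder; the decoder itself (graph convolution plus self-attention) is not sketched here.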