Paper title
Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement
Paper authors
Paper abstract
Learning a good 3D human pose representation is important for human pose-related tasks, e.g. 3D human pose estimation and action recognition. In all of these problems, preserving the intrinsic pose information and adapting to view variations are two critical issues. In this work, we propose a novel Siamese denoising autoencoder that learns a 3D pose representation by disentangling the pose-dependent and view-dependent features of human skeleton data in a fully unsupervised manner. These two disentangled features are used together as the representation of the 3D pose. To account for both kinematic and geometric dependencies, a sequential bidirectional recursive network (SeBiReNet) is further proposed to model the human skeleton data. Extensive experiments demonstrate that the learned representation 1) preserves the intrinsic information of human pose, and 2) shows good transferability across datasets and tasks. Notably, our approach achieves state-of-the-art performance on two inherently different tasks: pose denoising and unsupervised action recognition. Code and models are available at: \url{https://github.com/NIEQiang001/unsupervised-human-pose.git}
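To make the distinction between pose-dependent and view-dependent information concrete, the following minimal sketch rotates a toy skeleton about the vertical axis and compares bone lengths before and after. It is not the Siamese autoencoder or SeBiReNet described above. It assumes that a viewpoint change can be modeled as a rigid rotation, and the four-joint skeleton and the helper names `rotate_y` and `bone_lengths` are hypothetical, chosen only for illustration.

```python
import math

def rotate_y(joints, angle):
    """Rotate every (x, y, z) joint about the vertical y-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in joints]

def bone_lengths(joints, bones):
    """Length of each bone, where a bone is a (parent_index, child_index) pair."""
    return [math.dist(joints[a], joints[b]) for a, b in bones]

# Hypothetical toy skeleton (assumption, not from the paper): hip, spine, head, hand.
joints = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.9, 0.1), (0.4, 0.6, 0.2)]
bones = [(0, 1), (1, 2), (1, 3)]

# Simulate observing the same pose from a viewpoint rotated by 60 degrees.
rotated = rotate_y(joints, math.radians(60))

# Raw coordinates change with the view, while bone lengths do not.
coords_changed = rotated[3] != joints[3]
lengths_preserved = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(bone_lengths(joints, bones), bone_lengths(rotated, bones))
)
```

Here the raw joint coordinates play the role of view-dependent data, while the bone lengths are one simple example of a quantity that depends only on the pose. A learned representation would have to capture far richer pose information than this hand-picked invariant.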