Paper Title
Kinematics Modeling Network for Video-based Human Pose Estimation
Paper Authors
Abstract
Estimating human poses from videos is critical for human-computer interaction. Joints cooperate rather than move independently during human movement, so there are both spatial and temporal correlations between joints. Despite the positive results of previous approaches, most focus on modeling the spatial correlation between joints while only straightforwardly integrating features along the temporal dimension, ignoring the temporal correlation between joints. In this work, we propose a plug-and-play kinematics modeling module (KMM) to explicitly model temporal correlations between joints across different frames by calculating their temporal similarity. In this way, KMM can capture motion cues of the current joint relative to all joints at different time steps. Moreover, we formulate video-based human pose estimation as a Markov Decision Process and design a novel kinematics modeling network (KIMNet) to simulate the Markov Chain, allowing KIMNet to locate joints recursively. Our approach achieves state-of-the-art results on two challenging benchmarks. In particular, KIMNet shows robustness to occlusion. The code will be released at https://github.com/YHDang/KIMNet.
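The core idea of the abstract, measuring temporal similarity between per-joint features of different frames and using it to weight past features into motion cues, can be sketched as follows. This is a hypothetical illustration under assumed shapes (`J` joints, `C`-dim features), not the authors' actual KMM implementation:

```python
# Hypothetical sketch of the temporal-similarity idea from the abstract:
# joint features of the current frame attend to joint features of a past
# frame, so each joint aggregates motion cues from all joints over time.
import numpy as np

def temporal_similarity(curr, past):
    """curr, past: (J, C) per-joint feature arrays for two frames.

    Returns a (J, J) matrix whose entry (i, j) is the softmax-normalized
    similarity between joint i in the current frame and joint j in the
    past frame.
    """
    scores = curr @ past.T / np.sqrt(curr.shape[1])  # scaled dot product
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def aggregate_motion_cues(curr, past):
    """Weight past joint features by temporal similarity: (J, C) motion cues."""
    return temporal_similarity(curr, past) @ past

# Toy usage with assumed dimensions (e.g. 17 COCO-style joints, 64-dim features).
J, C = 17, 64
rng = np.random.default_rng(0)
curr = rng.normal(size=(J, C))
past = rng.normal(size=(J, C))
cues = aggregate_motion_cues(curr, past)  # per-joint motion cues, shape (J, C)
```

Each row of the similarity matrix sums to one, so `cues` is a convex combination of past joint features; in a full network these cues would be fused with the current frame's features before joint localization.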