Paper Title
Recognition of Freely Selected Keypoints on Human Limbs
Paper Authors
Paper Abstract
Nearly all Human Pose Estimation (HPE) datasets consist of a fixed set of keypoints. Standard HPE models trained on such datasets can only detect these keypoints. If more points are desired, they have to be manually annotated and the model needs to be retrained. Our approach leverages the Vision Transformer architecture to extend the model's capability to detect arbitrary keypoints on the limbs of persons. We propose two different approaches to encode the desired keypoints. (1) Each keypoint is defined by its position along the line connecting the two enclosing keypoints from the fixed set, together with its relative distance between this line and the edge of the limb. (2) Keypoints are defined as coordinates on a norm pose. Both approaches are based on the TokenPose architecture, with the keypoint tokens that correspond to the fixed keypoints replaced by our novel module. Experiments show that our approaches achieve results similar to TokenPose on the fixed keypoints and are capable of detecting arbitrary keypoints on the limbs.
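Encoding (1) can be illustrated with a small geometric sketch. The function below is a hypothetical decoding of such a representation, not the paper's implementation: given the two enclosing fixed keypoints, a relative position `t` along their connecting line, a relative offset `s` toward the limb edge, and an assumed limb half-width, it recovers the 2D target point. All names and the half-width parameter are illustrative assumptions.

```python
import numpy as np

def decode_limb_keypoint(p1, p2, t, s, half_width):
    """Hypothetical decoder for encoding (1).

    p1, p2      -- the two enclosing fixed keypoints (x, y)
    t           -- relative position along the line p1 -> p2, in [0, 1]
    s           -- relative distance from the line toward the limb edge,
                   in [-1, 1] (sign selects the side of the limb)
    half_width  -- assumed half-width of the limb in pixels (illustrative)
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    on_line = p1 + t * (p2 - p1)                # point on the connecting line
    direction = p2 - p1
    normal = np.array([-direction[1], direction[0]])  # perpendicular to the line
    length = np.linalg.norm(normal)
    if length > 0:
        normal /= length                        # unit normal
    return on_line + s * half_width * normal    # offset toward the limb edge

# Example: midpoint of a horizontal limb segment, offset to one edge.
pt = decode_limb_keypoint((0, 0), (10, 0), t=0.5, s=1.0, half_width=4.0)
print(pt)  # -> [5. 4.]
```

Encoding (2), by contrast, needs no per-limb geometry at query time: a keypoint is simply a 2D coordinate on a canonical (norm) pose that the model learns to map onto the actual person.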