Paper Title

Actions as Moving Points

Authors

Yixuan Li, Zixu Wang, Limin Wang, Gangshan Wu

Abstract

The existing action tubelet detectors often depend on heuristic anchor design and placement, which might be computationally expensive and sub-optimal for precise localization. In this paper, we present a conceptually simple, computationally efficient, and more precise action tubelet detection framework, termed as MovingCenter Detector (MOC-detector), by treating an action instance as a trajectory of moving points. Based on the insight that movement information could simplify and assist action tubelet detection, our MOC-detector is composed of three crucial head branches: (1) Center Branch for instance center detection and action recognition, (2) Movement Branch for movement estimation at adjacent frames to form trajectories of moving points, (3) Box Branch for spatial extent detection by directly regressing bounding box size at each estimated center. These three branches work together to generate the tubelet detection results, which could be further linked to yield video-level tubes with a matching strategy. Our MOC-detector outperforms the existing state-of-the-art methods for both metrics of frame-mAP and video-mAP on the JHMDB and UCF101-24 datasets. The performance gap is more evident for higher video IoU, demonstrating that our MOC-detector is particularly effective for more precise action detection. We provide the code at https://github.com/MCG-NJU/MOC-Detector.
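To make the three-branch design concrete, here is a minimal sketch of how the head outputs described above could be decoded into one tubelet: the Center Branch gives a key-frame center, the Movement Branch gives per-frame offsets that trace the trajectory of moving points, and the Box Branch gives a box size regressed at each estimated center. The function name, array shapes, and decoding details are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def decode_tubelet(center_kf, movements, box_sizes):
    """Illustrative decoding of one action tubelet from MOC-style head outputs.

    center_kf : (x, y) instance center detected on the key frame (Center Branch).
    movements : (K, 2) per-frame offsets from the key-frame center
                (Movement Branch), one row per frame of the tubelet.
    box_sizes : (K, 2) box sizes (w, h) regressed at each estimated center
                (Box Branch).
    Returns a (K, 4) array of per-frame boxes (x1, y1, x2, y2).
    """
    # Shift the key-frame center by each frame's offset: the "moving points".
    centers = np.asarray(center_kf, dtype=float) + np.asarray(movements, dtype=float)
    half = np.asarray(box_sizes, dtype=float) / 2.0
    # Expand each center into a box of the regressed size.
    return np.hstack([centers - half, centers + half])

# Toy example: a 3-frame tubelet whose center drifts down-right.
boxes = decode_tubelet(
    center_kf=(50.0, 40.0),
    movements=[(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)],
    box_sizes=[(20.0, 30.0)] * 3,
)
# boxes[0] is the key-frame box [40., 25., 60., 55.]
```

Tubelets decoded this way per clip could then be linked across clips into video-level tubes with a matching strategy, as the abstract notes.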
