Paper Title
Multiple Object Tracking by Flowing and Fusing
Paper Authors
Paper Abstract
Most Multiple Object Tracking (MOT) approaches compute individual target features for two subtasks: estimating target-wise motions and conducting pair-wise Re-Identification (Re-ID). Because the number of targets varies across video frames, both subtasks are difficult to scale efficiently in end-to-end Deep Neural Networks (DNNs). In this paper, we design an end-to-end DNN tracking approach, Flow-Fuse-Tracker (FFT), that addresses these issues with two efficient techniques: target flowing and target fusing. Specifically, in target flowing, a FlowTracker DNN module jointly learns an indefinite number of target-wise motions from pixel-level optical flow. In target fusing, a FuseTracker DNN module refines and fuses the targets proposed by FlowTracker and by frame-wise object detection, instead of trusting either of these two inaccurate sources of target proposals. Because FlowTracker can explore complex target-wise motion patterns and FuseTracker can refine and fuse targets from FlowTracker and detectors, our approach achieves state-of-the-art results on several MOT benchmarks. As an online MOT method, FFT produced the top MOTA of 46.3 on 2DMOT15, 56.5 on MOT16, and 56.5 on MOT17, surpassing all online and offline methods in existing publications.
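To make the two ideas concrete, the sketch below illustrates them with simple non-learned stand-ins: per-target motion is approximated by averaging dense optical flow inside each target box (a stand-in for the FlowTracker DNN module), and the flowed tracks are merged with frame-wise detections by greedy IoU matching (a stand-in for the FuseTracker DNN module). The function names, the 0.5 IoU threshold, and the greedy matching rule are illustrative assumptions, not the paper's actual implementation, which learns both steps end-to-end.

```python
import numpy as np

def flow_targets(boxes, flow):
    """Shift each target box by the mean optical flow inside it.

    Simplified, non-learned stand-in for FlowTracker: the paper learns
    target-wise motions jointly from pixel-level optical flow; here we
    just average the flow vectors within each box.
    boxes: (N, 4) array of [x1, y1, x2, y2]
    flow:  (H, W, 2) dense optical flow from frame t to frame t+1
    """
    moved = []
    for x1, y1, x2, y2 in boxes.astype(int):
        patch = flow[y1:y2, x1:x2]
        dx, dy = patch.reshape(-1, 2).mean(axis=0) if patch.size else (0.0, 0.0)
        moved.append([x1 + dx, y1 + dy, x2 + dx, y2 + dy])
    return np.array(moved)

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse_targets(flowed, detections, iou_thresh=0.5):
    """Greedily match flowed tracks to new detections by IoU.

    Simplified stand-in for FuseTracker: matched pairs keep the track
    identity but adopt the detection box; unmatched flowed boxes are
    kept as-is; unmatched detections start new tracks. The real module
    refines both proposal sources with a DNN instead of trusting either.
    """
    fused, used = [], set()
    for tid, fbox in enumerate(flowed):
        best_j, best_iou = -1, iou_thresh
        for j, dbox in enumerate(detections):
            if j in used:
                continue
            score = iou(fbox, dbox)
            if score > best_iou:
                best_j, best_iou = j, score
        if best_j >= 0:
            used.add(best_j)
            fused.append((tid, detections[best_j]))  # keep identity, use detection box
        else:
            fused.append((tid, fbox))                # keep the flowed box
    new_tracks = [d for j, d in enumerate(detections) if j not in used]
    return fused, new_tracks
```

In this toy version, identities propagate only through the flowed boxes and the IoU matching; the paper's FuseTracker additionally refines box locations, which is why FFT does not need a separate pair-wise Re-ID step.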