Paper Title
Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework
Paper Authors
Paper Abstract
The currently popular two-stream, two-stage tracking framework extracts the template and search region features separately and only then performs relation modeling. As a result, the extracted features lack awareness of the target and have limited target-background discriminability. To address this issue, we propose a novel one-stream tracking (OSTrack) framework that unifies feature learning and relation modeling by bridging the template-search image pairs with bidirectional information flows. In this way, discriminative target-oriented features can be dynamically extracted through mutual guidance. Since no extra heavy relation modeling module is needed and the implementation is highly parallelized, the proposed tracker runs fast. To further improve inference efficiency, an in-network candidate early elimination module is proposed based on the strong similarity prior calculated in the one-stream framework. As a unified framework, OSTrack achieves state-of-the-art performance on multiple benchmarks. In particular, it shows impressive results on the one-shot tracking benchmark GOT-10k, achieving 73.7% AO and improving on the existing best result (SwinTrack) by 4.3%. In addition, our method maintains a good performance-speed trade-off and converges faster. The code and models are available at https://github.com/botaoye/OSTrack.
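The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the idea under stated assumptions, not the authors' implementation (which is in the linked repository). It concatenates template and search tokens into one sequence and applies a single self-attention step, so that information flows in both directions between the two. It then discards the search tokens that are least similar to the template. The function names `joint_attention` and `eliminate_candidates`, the `keep_ratio` parameter, the identity Q/K/V projections, the token counts, and the use of averaged template-to-search attention weights as the similarity score are all illustrative assumptions.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(template, search):
    """One self-attention step over the concatenated template + search tokens.

    Assumption: single head with identity Q/K/V projections, to keep the
    sketch short. Because both token sets share one sequence, template
    tokens attend to search tokens and vice versa.
    """
    tokens = template + search
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[d] for w, v in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


def eliminate_candidates(search_out, weights, n_template, keep_ratio):
    """Keep the search tokens that receive the most attention from the template.

    Assumption: similarity is the attention weight from each template query
    to each search key, averaged over template tokens.
    """
    n_search = len(search_out)
    sim = [
        sum(weights[t][n_template + s] for t in range(n_template)) / n_template
        for s in range(n_search)
    ]
    n_keep = max(1, math.ceil(keep_ratio * n_search))
    ranked = sorted(range(n_search), key=lambda s: sim[s], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore spatial order of surviving tokens
    return kept, [search_out[s] for s in kept]


# Toy data: 4 template tokens and 16 search tokens of dimension 8 (hypothetical sizes).
random.seed(0)
dim = 8
template = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
search = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(16)]

outputs, weights = joint_attention(template, search)
search_out = outputs[len(template):]
kept, kept_tokens = eliminate_candidates(
    search_out, weights, len(template), keep_ratio=0.5
)
print(len(outputs), len(kept))
```

In this sketch the eliminated tokens are simply dropped, so any later layer would process a shorter sequence, which is where the inference saving described in the abstract would come from.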