Appearance Fusion of Multiple Cues for Video Co-localization

Paper Title

Appearance Fusion of Multiple Cues for Video Co-localization

Authors

Jerripothula, Koteswar Rao

Abstract

This work addresses the problem of joint object discovery in videos using multiple object-related cues. In contrast to the usual spatial fusion approach, it presents a novel appearance fusion approach. Specifically, the paper proposes an effective process for fusing the different GMMs derived from multiple cues into a single GMM. Like any fusion strategy, this approach needs some guidance, for which the proposed method relies on reliability and consensus phenomena. As a case study, the methodology is presented on the "video co-localization" object discovery problem. Experiments on the YouTube Objects and YouTube Co-localization datasets demonstrate that the proposed appearance fusion method has a clear advantage over both the spatial fusion strategy and the current state-of-the-art video co-localization methods.
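The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one way several per-cue GMMs could be combined into a single GMM. It assumes a plain "mixture of mixtures": the components of all cue GMMs are pooled, and each cue's component weights are scaled by a normalized guidance score for that cue. The function name `fuse_gmms`, the `cue_weights` scores (standing in for the reliability and consensus guidance), and the toy numbers are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fuse_gmms(gmms, cue_weights):
    """Pool several GMMs into one GMM (illustrative assumption, not the paper's rule).

    gmms:        list of (weights, means, covs) tuples, one per cue, with shapes
                 (K,), (K, D) and (K, D, D) respectively.
    cue_weights: one non-negative guidance score per cue (hypothetical stand-in
                 for reliability/consensus).
    """
    cue_weights = np.asarray(cue_weights, dtype=float)
    cue_weights = cue_weights / cue_weights.sum()  # normalize scores to sum to 1

    # Scale each cue's component weights by that cue's normalized score.
    weights = np.concatenate(
        [c * np.asarray(w, dtype=float) for c, (w, _, _) in zip(cue_weights, gmms)]
    )
    # Means and covariances are pooled unchanged.
    means = np.concatenate([np.asarray(m, dtype=float) for _, m, _ in gmms], axis=0)
    covs = np.concatenate([np.asarray(s, dtype=float) for _, _, s in gmms], axis=0)
    return weights, means, covs

# Toy example: two cues, each with a 2-component GMM over 3-D colour features.
eye = np.eye(3)
gmm_a = ([0.5, 0.5], [[0.1, 0.1, 0.1], [0.9, 0.9, 0.9]], [eye, eye])
gmm_b = ([0.2, 0.8], [[0.3, 0.3, 0.3], [0.6, 0.6, 0.6]], [eye, eye])

fused_w, fused_mu, fused_cov = fuse_gmms([gmm_a, gmm_b], cue_weights=[3.0, 1.0])
print(fused_w, fused_w.sum())
```

Because each cue's weights sum to one and the cue scores are normalized, the fused weights again sum to one, so the result is a valid GMM. How the guidance scores are actually computed from reliability and consensus is left open here.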
