Paper Title

You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos

Authors

Sun, Xin; Wang, Xuan; Gao, Jialin; Liu, Qiong; Zhou, Xi

Abstract

Moment retrieval in videos is a challenging task that aims to retrieve the most relevant video moment in an untrimmed video given a sentence description. Previous methods tend to perform self-modal learning and cross-modal interaction in a coarse manner, neglecting fine-grained clues contained in video content, query context, and their alignment. To this end, we propose a novel Multi-Granularity Perception Network (MGPN) that perceives intra-modality and inter-modality information at a multi-granularity level. Specifically, we formulate moment retrieval as a multi-choice reading comprehension task and integrate human reading strategies into our framework. A coarse-grained feature encoder and a co-attention mechanism are used to obtain a preliminary perception of intra-modality and inter-modality information. A fine-grained feature encoder and a conditioned interaction module are then introduced to enhance this initial perception, inspired by how humans address reading comprehension problems. Moreover, to alleviate the heavy computational burden of some existing methods, we further design an efficient choice comparison module and reduce the hidden size with imperceptible quality loss. Extensive experiments on the Charades-STA, TACoS, and ActivityNet Captions datasets demonstrate that our solution outperforms existing state-of-the-art methods. Code is available at github.com/Huntersxsx/MGPN.
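To make the co-attention idea in the abstract more concrete, the following is a minimal sketch of a generic dot-product co-attention between clip features and word features. It is not the authors' implementation: the function names (`co_attention`, `weighted_sum`), the use of a plain dot product as the affinity, and the toy 2-dimensional features are all assumptions made for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    # Combine vectors using the given attention weights.
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def co_attention(clips, words):
    # Hypothetical helper: clips is a list of clip feature vectors,
    # words is a list of word feature vectors of the same dimension.
    # affinity[i][j] is the similarity between clip i and word j.
    affinity = [[dot(c, w) for w in words] for c in clips]

    # Each clip attends over all words (video -> query direction).
    clip_to_word = [softmax(row) for row in affinity]
    query_aware_clips = [weighted_sum(a, words) for a in clip_to_word]

    # Each word attends over all clips (query -> video direction).
    columns = [list(col) for col in zip(*affinity)]
    word_to_clip = [softmax(col) for col in columns]
    video_aware_words = [weighted_sum(a, clips) for a in word_to_clip]

    return query_aware_clips, video_aware_words, clip_to_word, word_to_clip

# Toy example: three clips and two words in a 2-dimensional feature space.
clips = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
query_aware_clips, video_aware_words, clip_to_word, word_to_clip = co_attention(clips, words)
```

In this sketch, each row of `clip_to_word` is a probability distribution over the query words, so a clip whose features align with a particular word places more weight on that word. A learned model would typically project both modalities before computing the affinity; that step is omitted here for brevity.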
