Paper Title

Knowledge Guided Bidirectional Attention Network for Human-Object Interaction Detection

Authors

Jingjia Huang, Baixiang Yang

Abstract

Human-Object Interaction (HOI) detection is a challenging task that requires distinguishing the interaction between a human-object pair. Attention-based relation parsing is a popular and effective strategy in HOI detection. However, current methods perform relation parsing in a "bottom-up" manner. We argue that relying on the bottom-up parsing strategy alone is counter-intuitive in HOI and can lead to diffusion of attention. Therefore, we introduce a novel knowledge-guided top-down attention into HOI and propose to model relation parsing as a "look and search" process: first perform scene-context modeling (i.e., look), and then, given knowledge of the target pair, search for visual clues that discriminate the interaction between the pair. We implement this process by unifying bottom-up and top-down attention in a single encoder-decoder based model. Experimental results show that our model achieves competitive performance on the V-COCO and HICO-DET datasets.
