Learning Instance Representation Banks for Aerial Scene Classification

Paper Title

Learning Instance Representation Banks for Aerial Scene Classification

Authors

Yi, Jingjun; Zhou, Beichen

Abstract

Because of the bird's-eye view, aerial scenes are more complicated than natural scenes in terms of object distribution and spatial arrangement, so learning discriminative scene representations remains challenging. Recent solutions design local semantic descriptors so that regions of interest (RoIs) can be properly highlighted. However, each local descriptor has limited descriptive capability, and the overall scene representation remains to be refined. In this paper, we address this problem by designing a novel representation set named the instance representation bank (IRB), which unifies multiple local descriptors under the multiple instance learning (MIL) formulation. This unified framework is non-trivial, as all the local semantic descriptors can be aligned to the same scene scheme, enhancing the scene representation capability. Specifically, our IRB learning framework consists of a backbone, an instance representation bank, a semantic fusion module, and a scene scheme alignment loss function. All the components are organized in an end-to-end manner. Extensive experiments on three aerial scene benchmarks demonstrate that our proposed method outperforms state-of-the-art approaches by a large margin.
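The multiple instance learning (MIL) formulation mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' IRB implementation. It only assumes that a scene is a "bag" of local patches ("instances"), that each instance carries one score per class, and that several pooling operators turn instance scores into bag-level scores. The pooling choices (mean, max, log-sum-exp), the averaging step in `fuse`, and all names and numbers are hypothetical and chosen purely for illustration.

```python
import math

def mean_pool(scores):
    """Bag score as the average of the instance scores."""
    return sum(scores) / len(scores)

def max_pool(scores):
    """Bag score as the strongest single instance score."""
    return max(scores)

def lse_pool(scores, r=5.0):
    """Log-sum-exp pooling, a smooth operator lying between mean and max."""
    m = max(scores)  # subtract the max for numerical stability
    total = sum(math.exp(r * (s - m)) for s in scores)
    return m + math.log(total / len(scores)) / r

def bag_scores(instances, pool):
    """Aggregate per-instance class scores into one bag-level score per class."""
    num_classes = len(instances[0])
    return [pool([inst[c] for inst in instances]) for c in range(num_classes)]

def fuse(descriptor_outputs):
    """Hypothetical fusion step: average the bag-level scores of all descriptors."""
    n = len(descriptor_outputs)
    return [sum(col) / n for col in zip(*descriptor_outputs)]

# Toy scene (made-up numbers): 4 instances (patches), 3 classes.
instances = [
    [0.1, 0.2, 0.0],
    [0.0, 2.5, 0.1],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.0],
]

# One bag-level prediction per pooling operator, then a fused prediction.
bank = [bag_scores(instances, p) for p in (mean_pool, max_pool, lse_pool)]
fused = fuse(bank)
predicted = max(range(len(fused)), key=fused.__getitem__)
print("fused scores:", fused, "predicted class:", predicted)
```

In this toy setting every pooling operator receives the same instances and is judged against the same scene label, which is one simple way to read "aligned to the same scene scheme"; the loss the paper actually uses is not specified in the abstract.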
