Paper Title
Image Annotation based on Deep Hierarchical Context Networks
Paper Authors
Paper Abstract
Context modeling is one of the most fertile subfields of visual recognition; it aims at designing discriminant image representations while incorporating their intrinsic and extrinsic relationships. However, the potential of context modeling is currently underexplored, and most existing solutions are either context-free or restricted to simple handcrafted geometric relationships. In this paper we introduce DHCN, a novel Deep Hierarchical Context Network that leverages different sources of context, including geometric and semantic relationships. The proposed method is based on the minimization of an objective function mixing a fidelity term, a context criterion and a regularizer. The solution of this objective function defines the architecture of a bi-level hierarchical context network: the first level captures scene geometry, while the second corresponds to semantic relationships. We solve this representation learning problem by training the underlying deep network, whose parameters correspond to the most influential bi-level contextual relationships, and we evaluate its performance on image annotation using the challenging ImageCLEF benchmark.
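To make the three-term objective described above concrete, the following is a minimal PyTorch sketch of one plausible instantiation, not the paper's exact formulation: the function name dhcn_objective, the geometric and semantic adjacency matrices A_geom and A_sem, the inner-product form of the context criterion, and all hyperparameter values are illustrative assumptions.

```python
import torch

def dhcn_objective(phi, phi0, adjacency, alpha, beta):
    """Hypothetical three-term objective in the spirit of the abstract:
      - fidelity: keeps the learned representation phi close to an
        initial (context-free) representation phi0,
      - context criterion: rewards agreement between representations
        linked by each context level (geometric, then semantic),
      - regularizer: l2 penalty controlling the representation norm.
    phi, phi0: (n, d) representation matrices.
    adjacency: list of (n, n) context matrices, e.g. [A_geom, A_sem].
    """
    fidelity = torch.sum((phi - phi0) ** 2)
    # Bi-level context term: trace(phi^T A phi) sums A_ij <phi_i, phi_j>,
    # so minimizing its negation aligns context-linked representations.
    context = sum(-torch.trace(phi.t() @ A @ phi) for A in adjacency)
    regularizer = torch.sum(phi ** 2)
    return fidelity + alpha * context + beta * regularizer


# Tiny usage example on random data (n=5 image regions, d=8 features);
# the two adjacency matrices stand in for the two context levels.
n, d = 5, 8
phi0 = torch.randn(n, d)
phi = phi0.clone().requires_grad_(True)
A_geom = torch.rand(n, n)   # hypothetical geometric-context links
A_sem = torch.rand(n, n)    # hypothetical semantic-context links

optimizer = torch.optim.SGD([phi], lr=0.01)
for _ in range(100):
    optimizer.zero_grad()
    loss = dhcn_objective(phi, phi0, [A_geom, A_sem], alpha=0.1, beta=0.01)
    loss.backward()
    optimizer.step()
```

In this sketch, the iterative minimization plays the role of "unfolding" the solution into network layers; in the paper, the solution of the objective is what defines the bi-level architecture, with the first level tied to scene geometry and the second to semantic relationships.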