Exploring the ability of CNNs to generalise to previously unseen scales

Paper title

Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges

Authors

Ylva Jansson, Tony Lindeberg

Abstract

The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We, therefore, present a theoretical analysis of invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, also when training on single scale training data, and do also give improvements in the small sample regime.
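
The idea of foveated scale channels with shared weights and pooling over the channel outputs can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`rescale_about_centre`, `shared_channel`, `fov_max`, `fov_avg`), the nearest-neighbour resampling, and the single linear layer standing in for the shared CNN are all assumptions chosen to keep the example self-contained.

```python
import numpy as np


def rescale_about_centre(image, factor, out_size):
    """Resample a 2-D image by `factor` about its centre (nearest neighbour).

    Every channel yields an out_size x out_size array, so a smaller factor
    covers a larger part of the image at lower resolution (foveated layout).
    Samples that fall outside the image are set to zero.
    """
    h, w = image.shape
    idx = np.arange(out_size) - (out_size - 1) / 2.0
    rows = np.rint(idx / factor + (h - 1) / 2.0).astype(int)
    cols = np.rint(idx / factor + (w - 1) / 2.0).astype(int)
    valid = ((rows >= 0) & (rows < h))[:, None] & ((cols >= 0) & (cols < w))[None, :]
    sampled = image[np.clip(rows, 0, h - 1)[:, None], np.clip(cols, 0, w - 1)[None, :]]
    return np.where(valid, sampled, 0.0)


def shared_channel(x, weights):
    """Toy stand-in for the shared CNN: one linear layer followed by ReLU."""
    return np.maximum(weights @ x.ravel(), 0.0)


def scale_channel_outputs(image, weights, factors, out_size):
    """Apply the same weights to every rescaled copy (weight sharing)."""
    return np.stack([
        shared_channel(rescale_about_centre(image, f, out_size), weights)
        for f in factors
    ])


def fov_max(image, weights, factors, out_size):
    """Max pooling over the outputs of the scale channels."""
    return scale_channel_outputs(image, weights, factors, out_size).max(axis=0)


def fov_avg(image, weights, factors, out_size):
    """Average pooling over the outputs of the scale channels."""
    return scale_channel_outputs(image, weights, factors, out_size).mean(axis=0)


# Hypothetical usage: a 32x32 image, three classes, three scale channels.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
weights = rng.standard_normal((3, 8 * 8))
factors = [1.0, 0.5, 0.25]
print(fov_max(image, weights, factors, 8), fov_avg(image, weights, factors, 8))
```

Because every channel reuses the same weights, an object that appears at a different size in the input can, roughly speaking, be matched by a different channel, and pooling over channels discards the information about which channel responded. Exact invariance would additionally require ideal resampling and enough channels, which this sketch does not attempt.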
