Paper Title

Background Invariant Classification on Infrared Imagery by Data Efficient Training and Reducing Bias in CNNs

Authors

Maliha Arif, Calvin Yong, Abhijit Mahalanobis

Abstract

Even though convolutional neural networks can classify objects in images very accurately, it is well known that the attention of the network may not always be on the semantically important regions of the scene. It has been observed that networks often learn background textures which are not relevant to the object of interest. In turn, this makes the networks susceptible to variations and changes in the background, which negatively affect their performance. We propose a new two-step training procedure called split training to reduce this bias in CNNs on both infrared imagery and RGB data. Our split training procedure has two steps: first, using MSE loss, train the layers of the network on images with background to match the activations of the same network when it is trained on images without background; then, with these layers frozen, train the rest of the network with cross-entropy loss to classify the objects. Our training method outperforms the traditional training procedure both in a simple CNN architecture and in deep, resource-intensive CNNs such as VGG and DenseNet, achieving higher accuracy while learning to mimic human vision, which focuses more on shape and structure than on background.
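The two-step procedure in the abstract can be sketched in miniature. The following is a hedged NumPy illustration, not the authors' implementation: the "images", the single-layer ReLU "backbone", the pre-trained teacher weights `W_teacher`, and the label-generating matrix `W_lab` are all synthetic stand-ins invented for this example. Step 1 fits the backbone on background-cluttered inputs with MSE loss against the teacher's clean-image activations; step 2 freezes the backbone and trains a softmax head with cross-entropy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the paper's data (illustrative only):
#   x_clean : "images" without background
#   x_bg    : the same images with background clutter added
n, d, h, c = 200, 16, 12, 3
x_clean = rng.normal(size=(n, d))
x_bg = x_clean + 0.5 * rng.normal(size=(n, d))

# Hypothetical "teacher" backbone: same architecture, assumed already
# trained on the background-free images.
W_teacher = rng.normal(size=(d, h))
teacher_feats = np.maximum(x_clean @ W_teacher, 0.0)  # ReLU activations to match

# Hypothetical class labels, made recoverable from the teacher activations.
W_lab = rng.normal(size=(h, c))
labels = np.argmax(teacher_feats @ W_lab, axis=1)

# Step 1: train the backbone on background images with MSE loss so its
# activations match the teacher's clean-image activations.
W1 = 0.1 * rng.normal(size=(d, h))
for _ in range(500):
    pre = x_bg @ W1
    feats = np.maximum(pre, 0.0)
    grad = (2.0 / n) * x_bg.T @ ((feats - teacher_feats) * (pre > 0))
    W1 -= 0.05 * grad

# Step 2: freeze the backbone (W1 is no longer updated) and train a
# softmax classifier head with cross-entropy loss.
feats = np.maximum(x_bg @ W1, 0.0)  # frozen features
onehot = np.eye(c)[labels]
W2 = np.zeros((h, c))
for _ in range(1000):
    logits = feats @ W2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    W2 -= 0.1 * (feats.T @ (p - onehot)) / n

acc = float((np.argmax(feats @ W2, axis=1) == labels).mean())
```

In the paper's setting the backbone and head are CNN layers rather than single matrices, but the structure is the same: the MSE matching step steers the early layers toward background-invariant features before the classifier ever sees a loss signal.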
