Paper Title
Spectral Analysis for Semantic Segmentation with Applications on Feature Truncation and Weak Annotation
Paper Authors
Paper Abstract
It is well known that semantic segmentation neural networks (SSNNs) produce dense segmentation maps to resolve objects' boundaries, while restricting the prediction to down-sampled grids to alleviate the computational cost. A notable trade-off therefore exists between the accuracy and the training cost of SSNNs such as U-Net. We propose a spectral analysis to investigate the correlations among the resolution of the down-sampled grid, the loss function, and the accuracy of SSNNs. By analyzing the network back-propagation process in the frequency domain, we discover that the traditional loss function, cross-entropy, and the key features of the CNN are mainly affected by the low-frequency components of the segmentation labels. Our findings can be applied to SSNNs in several ways, including (i) determining an efficient low-resolution grid for resolving the segmentation maps, (ii) pruning the networks by truncating the high-frequency decoder features to save computation cost, and (iii) using block-wise weak annotation to save labeling time. Experimental results shown in this paper agree with our spectral analysis for networks such as DeepLab V3+ and Deep Aggregation Net (DAN).
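The abstract's central claim, that the information driving the loss is concentrated in the low-frequency components of the segmentation labels, can be illustrated with a small numerical sketch. The Python snippet below is not the paper's method; it merely computes the 2-D spectrum of a toy binary mask, reports how much spectral energy falls inside a small low-frequency band, and builds a block-wise label on a coarse grid, loosely mirroring the weak-annotation idea. The mask shape, the low-frequency band width, and the 16x16 block size are arbitrary assumptions chosen for illustration.

import numpy as np

# Illustrative sketch only (not from the paper): synthesize a binary
# segmentation mask, measure how much of its spectral energy lies in a
# central low-frequency band, and form a block-wise "weak" label by
# resolving the mask on a down-sampled grid.

H = W = 256
yy, xx = np.mgrid[0:H, 0:W]
# A disk-shaped object serves as a toy segmentation label.
mask = ((yy - 128) ** 2 + (xx - 96) ** 2 < 60 ** 2).astype(float)

# 2-D spectrum of the label, zero frequency shifted to the center.
spectrum = np.fft.fftshift(np.fft.fft2(mask))
energy = np.abs(spectrum) ** 2

# Fraction of energy inside the central band covering 25% of each axis
# (band half-width of H // 8 on either side of the center).
band = H // 8
cy, cx = H // 2, W // 2
low = energy[cy - band:cy + band, cx - band:cx + band].sum()
print(f"low-frequency energy fraction: {low / energy.sum():.3f}")

# Block-wise weak annotation: one label per 16x16 block (majority vote),
# i.e. the mask resolved on a coarse, down-sampled grid.
B = 16
blocks = mask.reshape(H // B, B, W // B, B).mean(axis=(1, 3)) > 0.5
print("weak label grid shape:", blocks.shape)

Running the sketch shows most of the spectral energy of such a piecewise-constant label sitting in the low-frequency band, which is the intuition behind truncating high-frequency decoder features and annotating on a coarser grid.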