Paper Title
Softer Pruning, Incremental Regularization
Paper Authors
Paper Abstract
Network pruning is widely used to compress Deep Neural Networks (DNNs). The Soft Filter Pruning (SFP) method zeroizes the pruned filters during training while updating them in the next training epoch, so the trained information of the pruned filters is completely dropped. To utilize the trained pruned filters, we propose a SofteR Filter Pruning (SRFP) method and its variant, Asymptotic SofteR Filter Pruning (ASRFP), which simply decay the pruned weights with a monotonically decreasing parameter. Our methods perform well across various networks, datasets and pruning rates, and are also transferable to weight pruning. On ILSVRC-2012, ASRFP prunes 40% of the parameters of ResNet-34 with a 1.63% top-1 and 0.68% top-5 accuracy improvement. In theory, SRFP and ASRFP amount to an incremental regularization of the pruned filters. Moreover, we note that SRFP and ASRFP achieve better results while slowing down convergence.
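To make the decay mechanism in the abstract concrete, the following PyTorch sketch contrasts SFP-style zeroizing with the softer decay described here: the least-important filters are scaled by a factor alpha instead of being set to zero, and an ASRFP-style schedule drives alpha toward zero over training. This is a minimal illustration under assumed details, not the authors' implementation; the function names srfp_step and asrfp_alpha, the L2-norm filter ranking, and the linear alpha schedule are assumptions for illustration only.

import torch

def srfp_step(weight: torch.Tensor, prune_rate: float, alpha: float) -> torch.Tensor:
    # weight: conv weight of shape (out_channels, in_channels, kH, kW)
    # prune_rate: fraction of filters to prune, e.g. 0.4
    # alpha: decay factor in [0, 1); alpha = 0 recovers SFP (zeroizing),
    #        a fixed alpha gives SRFP, and shrinking alpha per epoch gives ASRFP.
    num_filters = weight.shape[0]
    num_pruned = int(num_filters * prune_rate)
    if num_pruned == 0:
        return weight
    # Rank filters by L2 norm and select the smallest ones as "pruned".
    norms = weight.view(num_filters, -1).norm(p=2, dim=1)
    pruned_idx = torch.argsort(norms)[:num_pruned]
    # Decay (rather than zeroize) the selected filters.
    new_weight = weight.clone()
    new_weight[pruned_idx] *= alpha
    return new_weight

def asrfp_alpha(epoch: int, total_epochs: int) -> float:
    # Hypothetical monotonically decreasing schedule from 1 toward 0.
    return max(0.0, 1.0 - epoch / total_epochs)

In use, srfp_step would be applied to each convolutional layer at the end of every training epoch, with alpha either held fixed (SRFP) or taken from a decreasing schedule such as asrfp_alpha (ASRFP), so the pruned filters retain part of their trained information while still being regularized toward zero.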