Paper Title

The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks

Paper Authors

Xin Yu, Thiago Serra, Srikumar Ramalingam, Shandian Zhe

Paper Abstract

Neural networks tend to achieve better accuracy with training if they are larger -- even if the resulting models are overparameterized. Nevertheless, carefully removing such excess parameters before, during, or after training may also produce models with similar or even improved accuracy. In many cases, that can, curiously, be achieved by heuristics as simple as removing a percentage of the weights with the smallest absolute value -- even though magnitude is not a perfect proxy for weight relevance. With the premise that obtaining significantly better performance from pruning depends on accounting for the combined effect of removing multiple weights, we revisit one of the classic approaches for impact-based pruning: the Optimal Brain Surgeon (OBS). We propose a tractable heuristic for solving the combinatorial extension of OBS, in which we select weights for simultaneous removal, as well as a systematic update of the remaining weights. Our selection method outperforms other methods under high sparsity, and the weight update is advantageous even when combined with the other methods.
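As background for the two ideas the abstract contrasts, the sketch below illustrates (a) the magnitude-pruning heuristic (zero out a fraction of the smallest-magnitude weights) and (b) the classic single-weight OBS step that the paper's combinatorial extension generalizes: remove the weight q with smallest saliency w_q^2 / (2 [H^-1]_qq) and apply the closed-form update delta_w = -(w_q / [H^-1]_qq) H^-1 e_q. This is a minimal NumPy illustration, not the paper's method: the function names are hypothetical, it assumes a full inverse Hessian H_inv is available, and the paper's tractable heuristic for selecting a *set* of weights jointly is not reproduced here.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value.

    Ties at the threshold may remove slightly more than the requested fraction.
    """
    w = weights.flatten()
    k = int(sparsity * w.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold.
    threshold = np.partition(np.abs(w), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

def obs_prune_one(weights, H_inv):
    """One step of classic OBS (Hassibi & Stork): remove the single weight
    with the smallest saliency w_q^2 / (2 [H^-1]_qq), then update the
    remaining weights with the closed-form correction."""
    w = weights.flatten()
    diag = np.diag(H_inv)
    saliency = w ** 2 / (2.0 * diag)
    q = int(np.argmin(saliency))
    # Optimal update of the surviving weights: -(w_q / [H^-1]_qq) * H^-1 e_q
    delta = -(w[q] / diag[q]) * H_inv[:, q]
    w_new = w + delta
    w_new[q] = 0.0  # enforce exact removal of the pruned weight
    return w_new.reshape(weights.shape), q
```

The paper's combinatorial extension replaces the greedy one-weight-at-a-time selection above with a joint choice of the whole removal set, so that weights whose errors cancel one another can be pruned together.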
