Paper Title
Dynamic Hard Pruning of Neural Networks at the Edge of the Internet
Paper Authors
Paper Abstract
Neural Networks (NN), although successfully applied to several Artificial Intelligence tasks, are often unnecessarily over-parametrised. In edge/fog computing, this might make their training prohibitive on resource-constrained devices, contrasting with the current trend of decentralising intelligence from remote data centres to local constrained devices. Therefore, we investigate the problem of training effective NN models on constrained devices having a fixed, potentially small, memory budget. We target techniques that are both resource-efficient and performance-effective while enabling significant network compression. Our Dynamic Hard Pruning (DynHP) technique incrementally prunes the network during training, identifying neurons that contribute only marginally to the model accuracy. DynHP enables a tunable size reduction of the final neural network and reduces the NN memory occupancy during training. Freed memory is reused by a \emph{dynamic batch sizing} approach to counterbalance the accuracy degradation caused by the hard pruning strategy, improving its convergence and effectiveness. We assess the performance of DynHP through reproducible experiments on three public datasets, comparing it against reference competitors. Results show that DynHP compresses an NN up to $10$ times without significant performance drops (up to $3.5\%$ additional error w.r.t. the competitors), while reducing the training memory occupancy by up to $80\%$.
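The abstract describes two cooperating mechanisms: hard pruning of low-contribution neurons during training, and dynamic batch sizing that reinvests the freed memory into larger mini-batches. The following is a minimal sketch of that idea, not the paper's actual algorithm: the two-layer MLP, the pruning criterion (L2 norm of each hidden neuron's incoming weights), the pruning schedule, and the batch-growth rule are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PrunableMLP(nn.Module):
    # Two-layer MLP whose hidden layer can be physically shrunk ("hard" pruning).
    def __init__(self, in_dim=784, hidden=512, out_dim=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))

    def hard_prune(self, keep_ratio):
        # Drop the hidden neurons with the smallest L2 weight norm (assumed criterion).
        with torch.no_grad():
            norms = self.fc1.weight.norm(dim=1)            # one norm per hidden neuron
            k = max(1, int(keep_ratio * norms.numel()))
            keep = norms.topk(k).indices.sort().values     # indices of surviving neurons
            fc1 = nn.Linear(self.fc1.in_features, k)       # rebuild fc1 with the kept rows
            fc1.weight.copy_(self.fc1.weight[keep])
            fc1.bias.copy_(self.fc1.bias[keep])
            fc2 = nn.Linear(k, self.fc2.out_features)      # rebuild fc2 with matching columns
            fc2.weight.copy_(self.fc2.weight[:, keep])
            fc2.bias.copy_(self.fc2.bias)
            self.fc1, self.fc2 = fc1, fc2


def param_bytes(model):
    return sum(p.numel() * p.element_size() for p in model.parameters())


model = PrunableMLP()
budget = param_bytes(model)        # fixed memory budget taken at the start of training
batch_size = 64

for epoch in range(10):
    # ... one training epoch with the current batch_size would run here ...
    if epoch in (3, 6):            # prune at a few fixed points (assumed schedule)
        model.hard_prune(keep_ratio=0.5)
        freed = budget - param_bytes(model)
        # Dynamic batch sizing: reinvest freed parameter memory into larger mini-batches.
        batch_size = int(batch_size * (1 + freed / budget))
        print(f"epoch {epoch}: params={param_bytes(model)} B, batch_size={batch_size}")

Rebuilding the layers, rather than merely masking weights, is what makes the pruning "hard" in this sketch: the parameter tensors actually shrink, which is the kind of reduction in training-time memory occupancy the abstract refers to.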