Paper Title
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Paper Authors
Paper Abstract
Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNNs), thereby achieving acceleration on various platforms. However, most pruning techniques are essentially a trade-off between model accuracy and regularity, which leads to impaired inference accuracy and limited on-device acceleration. To solve this problem, we introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly. With carefully designed patterns, the proposed pruning consistently improves accuracy and feature extraction ability across different DNN structures and datasets, and our pattern-aware pruning framework performs pattern library extraction, pattern selection, pattern and connectivity pruning, and weight training simultaneously. Our approach to the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms. To the best of our knowledge, this is the first time mobile devices achieve real-time inference for large-scale DNN models, thanks to the unique spatial property of pattern-based sparsity and the code generation capability of compilers.
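
To make the two sparsity types concrete, the sketch below prunes a convolution weight tensor in two steps: each 3x3 kernel is masked with one pattern from a small library (pattern sparsity), and then whole kernels with the smallest magnitudes are removed (connectivity sparsity). This is a minimal illustration, not the paper's implementation: the `PATTERN_LIBRARY`, the magnitude-based pattern selection, and the `prune_ratio` value are assumptions made for this example.

```python
# Minimal sketch of pattern-based sparsity (illustrative only, not the
# authors' framework). Patterns and pruning ratio are assumed values.
import numpy as np

# Hypothetical pattern library: each pattern keeps 4 of the 9 entries of a 3x3 kernel.
PATTERN_LIBRARY = [
    np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]], dtype=np.float32),
    np.array([[0, 1, 1], [0, 1, 1], [0, 0, 0]], dtype=np.float32),
    np.array([[0, 0, 0], [1, 1, 0], [1, 1, 0]], dtype=np.float32),
    np.array([[0, 0, 0], [0, 1, 1], [0, 1, 1]], dtype=np.float32),
]

def apply_pattern_sparsity(weight):
    """Pattern sparsity: mask each 3x3 kernel with the library pattern
    that preserves the largest total weight magnitude."""
    out_c, in_c, kh, kw = weight.shape
    assert (kh, kw) == (3, 3)
    pruned = weight.copy()
    for o in range(out_c):
        for i in range(in_c):
            kernel = weight[o, i]
            best = max(PATTERN_LIBRARY, key=lambda m: np.abs(kernel * m).sum())
            pruned[o, i] = kernel * best
    return pruned

def apply_connectivity_sparsity(weight, prune_ratio=0.3):
    """Connectivity sparsity: remove whole kernels (input-output connections)
    whose L1 norms fall in the weakest prune_ratio fraction."""
    norms = np.abs(weight).sum(axis=(2, 3))        # one norm per kernel
    threshold = np.quantile(norms, prune_ratio)
    mask = (norms > threshold).astype(weight.dtype)
    return weight * mask[:, :, None, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 4, 3, 3)).astype(np.float32)
    w = apply_connectivity_sparsity(apply_pattern_sparsity(w))
    print("fraction of weights kept:", float((w != 0).mean()))
```

Because every surviving kernel follows one of a few fixed patterns and pruned kernels are removed entirely, the remaining computation has a regular structure that a compiler can exploit when generating mobile inference code.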