Paper Title
ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks
Paper Authors
Paper Abstract
Closing the gap between the hardware requirements of state-of-the-art convolutional neural networks and the limited resources of embedded applications is the next big challenge in deep learning research. The computational complexity and memory footprint of such neural networks are typically daunting for deployment in resource-constrained environments. Model compression techniques, such as pruning, are emphasized among other optimization methods for solving this problem. Most existing techniques require domain expertise or result in irregular sparse representations, which increase the burden of deploying deep learning applications on embedded hardware accelerators. In this paper, we propose the autoencoder-based low-rank filter-sharing technique (ALF). When applied to various networks, ALF is compared to state-of-the-art pruning methods, demonstrating its efficient compression capabilities on theoretical metrics as well as on an accurate, deterministic hardware model. In our experiments, ALF showed a reduction of 70\% in network parameters, 61\% in operations, and 41\% in execution time, with minimal loss in accuracy.
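To illustrate why low-rank filter sharing can yield parameter reductions of the magnitude reported above, the following back-of-the-envelope sketch counts the parameters of a dense convolutional filter bank versus a factorized one, where a small set of shared basis filters is combined by per-output-channel coefficients. The layer sizes and the rank are assumed for illustration only; this is not the paper's exact algorithm or configuration.

```python
# Hypothetical illustration (not ALF's exact method): a conv layer's filter
# bank of shape (c_out, c_in, k, k) is replaced by `rank` shared basis
# filters plus a (c_out x rank) mixing matrix.
c_out, c_in, k = 256, 256, 3   # assumed layer dimensions
rank = 64                      # assumed number of shared basis filters

dense_params = c_out * c_in * k * k                  # original filter bank
shared_params = rank * c_in * k * k + c_out * rank   # basis + coefficients

reduction = 1 - shared_params / dense_params
print(f"parameter reduction: {reduction:.0%}")  # roughly 72% for this setting
```

The point of the sketch is that the basis cost grows with `rank` rather than `c_out`, so choosing a rank well below the number of output channels trims parameters sharply, in the same ballpark as the 70\% reduction the abstract reports.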