Paper Title
Compilation and Optimizations for Efficient Machine Learning on Embedded Systems
Paper Authors
Paper Abstract
Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inference solutions in computer vision, natural language processing, virtual reality, and other domains. However, DNN-based ML applications also impose substantially higher computational and storage requirements, which are particularly challenging for embedded systems with limited compute/storage resources, tight power budgets, and small form factors. Further challenges arise from diverse application-specific requirements, including real-time response, high-throughput performance, and reliable inference accuracy. To address these challenges, we introduce a series of effective design methodologies, including efficient ML model design, customized hardware accelerator design, and hardware/software co-design strategies, to enable efficient ML applications on embedded systems.