Paper Title

Compilation and Optimizations for Efficient Machine Learning on Embedded Systems

Authors

Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen

Abstract

Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inference solutions in areas such as computer vision, natural language processing, and virtual reality. However, DNN-based ML applications also bring substantially increased computational and storage requirements, which are particularly challenging for embedded systems with limited compute/storage resources, tight power budgets, and small form factors. Challenges also come from diverse application-specific requirements, including real-time responses, high-throughput performance, and reliable inference accuracy. To address these challenges, we introduce a series of effective design methodologies, including efficient ML model designs, customized hardware accelerator designs, and hardware/software co-design strategies, to enable efficient ML applications on embedded systems.
