Paper Title
Leveraging the HW/SW Optimizations and Ecosystems that Drive the AI Revolution
Authors
Abstract
This paper presents a state-of-the-art overview of how to architect, design, and optimize Deep Neural Networks (DNNs) so that performance improves while accuracy is preserved. It covers a set of optimizations that span the entire machine learning processing pipeline. We introduce two types of optimizations: the first alters the DNN model and requires re-training, while the second does not. We focus on GPU optimizations, but we believe the presented techniques can be used with other AI inference platforms. To demonstrate the DNN model optimizations, we improve one of the most advanced deep network architectures for optical flow, RAFT (arXiv:2003.12039), on a popular edge AI inference platform, the NVIDIA Jetson AGX Xavier.