Paper Title
Accelerator-aware Neural Network Design using AutoML
Paper Authors
Paper Abstract
While neural network hardware accelerators provide a substantial amount of raw compute throughput, the models deployed on them must be co-designed for the underlying hardware architecture to obtain the optimal system performance. We present a class of computer vision models designed using hardware-aware neural architecture search and customized to run on the Edge TPU, Google's neural network hardware accelerator for low-power, edge devices. For the Edge TPU in Coral devices, these models enable real-time image classification performance while achieving accuracy typically seen only with larger, compute-heavy models running in data centers. On Pixel 4's Edge TPU, these models improve the accuracy-latency tradeoff over existing SoTA mobile models.
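The abstract describes models found via hardware-aware neural architecture search that optimize an accuracy-latency tradeoff. As a rough illustration of what such a search objective can look like, here is a minimal sketch of a latency-aware reward in the style popularized by MnasNet-like searches; the target latency, the exponent `w`, and the function shape are illustrative assumptions, not details given in the abstract.

```python
def nas_reward(accuracy: float, latency_ms: float,
               target_ms: float = 7.0, w: float = -0.07) -> float:
    """Soft accuracy-latency tradeoff: accuracy * (latency / target) ** w.

    With w < 0, candidates slower than the target latency are penalized
    and faster ones gain slightly, so the search favors architectures
    that fit the accelerator's latency budget. All constants here are
    hypothetical, chosen only to illustrate the mechanism.
    """
    return accuracy * (latency_ms / target_ms) ** w

# Under this objective, a slightly less accurate but faster candidate
# can outscore a more accurate but slower one:
slow = nas_reward(accuracy=0.78, latency_ms=10.0)
fast = nas_reward(accuracy=0.77, latency_ms=6.0)
```

In a real hardware-aware search, `latency_ms` would come from measurements or a latency model of the target accelerator (here, the Edge TPU) rather than a proxy such as FLOPs.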