Title
A Machine Learning Imaging Core using Separable FIR-IIR Filters
Authors
Abstract
We propose fixed-function neural network hardware that is designed to perform pixel-to-pixel image transformations in a highly efficient way. We use a fully trainable, fixed-topology neural network to build a model that can perform a wide variety of image processing tasks. Our model uses compressed skip lines and hybrid FIR-IIR blocks to reduce the latency and hardware footprint. Our proposed Machine Learning Imaging Core, dubbed MagIC, uses a silicon area of ~3 mm^2 (in TSMC 16nm), which is orders of magnitude smaller than a comparable pixel-wise dense prediction model. MagIC requires no DDR bandwidth, no SRAM, and practically no external memory. Each MagIC core consumes 56 mW (215 mW max power) at 500 MHz and achieves an energy-efficient throughput of 23 TOPS/W/mm^2. MagIC can be used as a multi-purpose image processing block in an imaging pipeline, approximating compute-heavy image processing applications, such as image deblurring, denoising, and colorization, within the power and silicon area limits of mobile devices.
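The abstract does not specify the filter implementation; as a purely illustrative sketch (not the paper's actual hardware pipeline), a separable FIR-IIR image filter might combine a horizontal FIR pass with a vertical first-order IIR pass, so the vertical recursion reuses one row of state instead of buffering a full FIR window. The function name, tap values, and recursion coefficient below are assumptions for illustration only.

```python
import numpy as np

def separable_fir_iir(img, fir_taps, iir_a):
    """Illustrative separable filter: FIR along rows, first-order IIR down columns.

    img      : 2-D array of pixel values
    fir_taps : 1-D FIR kernel applied horizontally
    iir_a    : recursion coefficient in [0, 1) for the vertical IIR pass
    """
    # Horizontal pass: FIR convolution along each row ('same' length output).
    out = np.apply_along_axis(
        lambda r: np.convolve(r, fir_taps, mode="same"), 1, img.astype(float))
    # Vertical pass: y[n] = (1 - a) * x[n] + a * y[n-1], run top to bottom,
    # so only the previous output row must be kept as state.
    y = np.empty_like(out)
    y[0] = out[0]
    for n in range(1, out.shape[0]):
        y[n] = (1.0 - iir_a) * out[n] + iir_a * y[n - 1]
    return y
```

The IIR direction needs only one row of state per pass, which is one way a hybrid FIR-IIR structure can cut line-buffer memory relative to a tall all-FIR kernel.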