Paper Title
Running Neural Networks on the NIC
Paper Authors
Paper Abstract
In this paper we show that the data plane of commodity programmable Network Interface Cards (NICs) can run the neural network inference tasks required by packet monitoring applications with low overhead. This is particularly important because transferring data to the host system or to dedicated machine learning accelerators, e.g., GPUs, can be more expensive than the processing task itself. We design and implement our system, N3IC, on two different NICs, and we show that it can greatly benefit three different network monitoring use cases that require machine learning inference as a first-class primitive. N3IC can perform inference for millions of network flows per second while forwarding traffic at 40Gb/s. Compared to an equivalent solution implemented on a general-purpose CPU, N3IC provides 100x lower processing latency and a 1.5x increase in throughput.