Paper Title
SOL: Reducing the Maintenance Overhead for Integrating Hardware Support into AI Frameworks
Authors
Abstract
The increased interest in Artificial Intelligence (AI) has raised the need for highly optimized and sophisticated AI frameworks. Starting with the Lua-based Torch, many frameworks have emerged over time, such as Theano, Caffe, Chainer, CNTK, MXNet, PyTorch, DL4J, or TensorFlow. All of these provide a high-level scripting API that allows users to easily design neural networks and run them on various kinds of hardware. What the user usually does not see is the high effort put into these frameworks to provide peak execution performance. While mainstream CPUs and GPUs have the "luxury" of a widespread user base in the open-source community, vendors of less mainstream CPUs, GPUs, or accelerators need to invest considerable effort to get their hardware supported by these frameworks. This includes not only the development of highly efficient compute libraries such as cuDNN, oneDNN, or VEDNN, but also support for an ever-growing number of simpler compute operations such as summation and multiplication. Each of these frameworks nowadays supports several hundred unique operations, with tensors of various sizes, shapes, and data types, which ends up in thousands of compute kernels required for each device type. And the number of operations keeps increasing. That is why NEC Laboratories Europe started developing the SOL AI Optimization project several years ago, to deliver optimal performance to users while keeping the maintenance burden minimal.