Which scaling rule applies to Artificial Neural Networks

Paper title

Which scaling rule applies to Artificial Neural Networks

Author

Végh, János

Abstract

Experience shows that cooperating and communicating computing systems, comprising segregated single processors, have severe performance limitations. In his classic "First Draft", von Neumann warned that using a "too fast processor" vitiates his simple "procedure" (but not his computing model!) and, furthermore, that using the classic computing paradigm to imitate neuronal operations is unsound. Amdahl added that large machines comprising many processors have an inherent disadvantage. Because the components of ANNs communicate heavily with each other, are built from a large number of components designed and fabricated for conventional computing, and attempt to mimic biological operation using improper technological solutions, their achievable payload computing performance is conceptually modest. The type of workload that AI-based systems generate leads to exceptionally low payload computational performance, and their design and technology limit their size to just above "toy"-level systems: the scaling of processor-based ANN systems is strongly nonlinear. Given the proliferation and growing size of ANN systems, we suggest ideas for estimating the efficiency of a device or application in advance. By analyzing published measurements, we provide evidence that data transfer time drastically influences both the performance and the feasibility of ANNs. We discuss how some major theoretical limiting factors, the layer structure of ANNs, and the methods used to implement their communication affect their efficiency. The paper starts from von Neumann's original model, without neglecting transfer time relative to processing time, and derives an appropriate interpretation and handling of Amdahl's law. It shows that, in that interpretation, Amdahl's law correctly describes ANNs.
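To illustrate the abstract's claim that scaling is strongly nonlinear, here is a minimal Python sketch of the textbook form of Amdahl's law, S = 1 / ((1 - a) + a / N). It is not the paper's own interpretation, which is not reproduced here. The function names and the value a = 0.99 are assumptions chosen for illustration. Treating data transfer time as part of the non-parallelizable share (1 - a) is likewise a simplifying assumption.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Textbook Amdahl speedup: S = 1 / ((1 - a) + a / N).

    parallel_fraction (a) is the share of the work that can be spread over
    n_processors (N). The remainder (1 - a) is assumed here to lump together
    everything that does not parallelize, including data transfer time.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def efficiency(parallel_fraction: float, n_processors: int) -> float:
    """Speedup per processor: the fraction of the nominal capacity that is used."""
    return amdahl_speedup(parallel_fraction, n_processors) / n_processors


if __name__ == "__main__":
    a = 0.99  # illustrative value only, not a figure taken from the paper
    for n in (1, 10, 100, 1_000, 10_000):
        print(f"N={n:>6}  speedup={amdahl_speedup(a, n):8.2f}  "
              f"efficiency={efficiency(a, n):.4f}")
```

Under these assumptions the speedup can never exceed 1 / (1 - a), so each added processor contributes less and the efficiency per processor falls as the system grows.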
