Paper Title
Instead of Rewriting Foreign Code for Machine Learning, Automatically Synthesize Fast Gradients
Paper Authors
William S. Moses, Valentin Churavy
Paper Abstract
Applying differentiable programming techniques and machine learning algorithms to foreign programs requires developers to either rewrite their code in a machine learning framework or otherwise provide derivatives of the foreign code. This paper presents Enzyme, a high-performance automatic differentiation (AD) compiler plugin for the LLVM compiler framework capable of synthesizing gradients of statically analyzable programs expressed in the LLVM intermediate representation (IR). Enzyme synthesizes gradients for programs written in any language whose compiler targets LLVM IR, including C, C++, Fortran, Julia, Rust, Swift, MLIR, etc., thereby providing native AD capabilities in these languages. Unlike traditional source-to-source and operator-overloading tools, Enzyme performs AD on optimized IR. On a machine-learning-focused benchmark suite that includes Microsoft's ADBench, AD on optimized IR achieves a geometric mean speedup of 4.5x over AD on IR before optimization, allowing Enzyme to achieve state-of-the-art performance. Packaging Enzyme for PyTorch and TensorFlow provides convenient access to gradients of foreign code with state-of-the-art performance, enabling foreign code to be directly incorporated into existing machine learning workflows.
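As a concrete illustration of how Enzyme exposes native AD in a language that targets LLVM IR, the sketch below differentiates a plain C function through Enzyme's __enzyme_autodiff entry point. This follows the publicly documented Enzyme C usage pattern; the file name and exact plugin invocation mentioned afterward are assumptions that depend on the local Enzyme installation.

// Minimal sketch of Enzyme's C interface. When the Enzyme plugin is
// loaded into Clang, calls to the __enzyme_autodiff symbol are replaced
// at compile time by a synthesized gradient of the target function.
#include <stdio.h>

double square(double x) { return x * x; }

// Declared but never defined: Enzyme resolves this during compilation.
extern double __enzyme_autodiff(void *, double);

int main(void) {
    double x = 3.0;
    // Derivative of square with respect to its (active) double argument.
    double dx = __enzyme_autodiff((void *)square, x);
    printf("square(%.1f) = %.1f, d/dx = %.1f\n", x, square(x), dx); // d/dx = 6.0
    return 0;
}

Compiling with something like clang square.c -fplugin=ClangEnzyme-<version>.so -O2 (plugin name assumed to match the installed build) runs Enzyme on the already-optimized IR, which is the source of the speedups reported in the abstract.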