A Deep Conjugate Direction Method for Iteratively Solving Linear Systems
Abstract
We present a novel deep learning approach for approximating the solution of large, sparse, symmetric, positive-definite linear systems of equations. Such systems arise in many problems in applied science, e.g., in numerical methods for partial differential equations. Algorithms for approximating their solution are often the computational bottleneck, particularly for modern applications that involve many millions of unknowns. Indeed, numerical linear algebra techniques have been investigated for decades to alleviate this burden, and data-driven techniques have recently shown promise as well. Motivated by the conjugate gradient algorithm, which iteratively selects search directions to minimize the matrix norm of the approximation error, we design an approach that uses a deep neural network to accelerate convergence via data-driven improvement of the search directions. Our method leverages a carefully chosen convolutional network to approximate the action of the inverse of the linear operator up to an arbitrary constant. We train the network with unsupervised learning, using a loss function equal to the $L^2$ difference between an input and the system matrix times the network evaluation, in which the unspecified constant in the approximate inverse is accounted for. We demonstrate the efficacy of our approach on spatially discretized Poisson equations with millions of degrees of freedom arising in computational fluid dynamics applications. Unlike state-of-the-art learning approaches, our algorithm reduces the linear system residual to a given tolerance in a small number of iterations, independent of the problem size. Moreover, our method generalizes effectively to systems beyond those encountered during training.
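To make the scale-invariant training objective concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy 1D discrete Poisson matrix as a stand-in for the paper's spatially discretized systems, and a placeholder network output `y`. The "unspecified constant" in the approximate inverse is accounted for by fitting the least-squares optimal scalar `c` before measuring the $L^2$ residual.

```python
import numpy as np

def poisson_matrix(n):
    """1D discrete Poisson matrix (SPD, tridiagonal): a toy stand-in
    for the spatially discretized Poisson operators in the paper."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def scale_invariant_loss(A, b, y):
    """L^2 difference between the input b and A @ y, where the arbitrary
    constant in the approximate inverse is absorbed by the scalar c that
    minimizes ||b - c * A y||^2 (c is the least-squares optimum)."""
    Ay = A @ y
    c = (b @ Ay) / (Ay @ Ay)
    r = b - c * Ay
    return float(r @ r)

n = 8
A = poisson_matrix(n)
b = np.random.default_rng(0).standard_normal(n)

# A hypothetical "perfect up to scale" network output: any nonzero
# multiple of the true solution A^{-1} b drives the loss to zero.
y_exact = 3.7 * np.linalg.solve(A, b)
print(scale_invariant_loss(A, b, y_exact))
```

Because the scaling `c` is fit inside the loss, a network need only learn the inverse's action up to a constant, which is exactly the invariance the abstract describes.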