Paper Title
Near-optimal learning of Banach-valued, high-dimensional functions via deep neural networks
Paper Authors
Paper Abstract
The past decade has seen increasing interest in applying Deep Learning (DL) to Computational Science and Engineering (CSE). Driven by impressive results in applications such as computer vision, Uncertainty Quantification (UQ), genetics, simulations and image processing, DL is increasingly supplanting classical algorithms, and seems poised to revolutionize scientific computing. However, DL is not yet well-understood from the standpoint of numerical analysis. Little is known about the efficiency and reliability of DL from the perspectives of stability, robustness, accuracy, and sample complexity. In particular, approximating solutions to parametric PDEs is an objective of UQ for CSE. Training data for such problems is often scarce and corrupted by errors. Moreover, the target function is a possibly infinite-dimensional smooth function taking values in the PDE solution space, generally an infinite-dimensional Banach space. This paper provides arguments for Deep Neural Network (DNN) approximation of such functions, with both known and unknown parametric dependence, that overcome the curse of dimensionality. We establish practical existence theorems that describe classes of DNNs with dimension-independent architecture size and training procedures based on minimizing the (regularized) $\ell^2$-loss which achieve near-optimal algebraic rates of convergence. These results involve key extensions of compressed sensing for Banach-valued recovery and polynomial emulation with DNNs. When approximating solutions of parametric PDEs, our results account for all sources of error, i.e., sampling, optimization, approximation and physical discretization, and allow for training high-fidelity DNN approximations from coarse-grained sample data. Our theoretical results fall into the category of non-intrusive methods, providing a theoretical alternative to classical methods for high-dimensional approximation.
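To make the training procedure alluded to in the abstract concrete, the following is a minimal, illustrative sketch of fitting a fixed-architecture feedforward DNN surrogate to samples of a smooth parametric map by minimizing a regularized $\ell^2$-loss. It is not the paper's construction: the target map `target_map`, the parameter dimension, the network widths, and the regularization weight are all hypothetical placeholders, and the Banach-valued output is represented here by a finite vector of discretization coefficients.

```python
# Illustrative sketch only (assumptions noted above), using PyTorch.
import torch
import torch.nn as nn

torch.manual_seed(0)

d = 10   # parameter dimension (hypothetical)
K = 25   # size of a finite discretization of the Banach-valued output (hypothetical)
m = 200  # number of (possibly coarse-grained, noisy) samples

def target_map(y):
    # Hypothetical smooth parametric map standing in for a parametric PDE
    # solution map, expressed through K discretization coefficients.
    freqs = torch.arange(1, K + 1, dtype=y.dtype)
    return torch.sin(y.sum(dim=-1, keepdim=True) * freqs) / freqs**2

# Sample data: parameters drawn uniformly from [-1, 1]^d, outputs with small noise.
Y = 2 * torch.rand(m, d) - 1
U = target_map(Y) + 1e-3 * torch.randn(m, K)

# Fixed-size feedforward architecture (widths chosen purely for illustration).
model = nn.Sequential(
    nn.Linear(d, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, K),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1e-6  # regularization weight (hypothetical)

for step in range(2000):
    optimizer.zero_grad()
    residual = model(Y) - U
    # Regularized l^2-loss: mean-squared data misfit plus a penalty on the weights.
    loss = residual.pow(2).mean() + lam * sum(p.pow(2).sum() for p in model.parameters())
    loss.backward()
    optimizer.step()

# Evaluate the trained surrogate at a new parameter value.
y_test = 2 * torch.rand(1, d) - 1
print(torch.linalg.norm(model(y_test) - target_map(y_test)).item())
```

In this toy setting the output discretization plays the role of the physical discretization error discussed in the abstract; the paper's results concern how sampling, optimization, approximation, and discretization errors combine for such surrogates, not this particular script.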