Paper Title

On the influence of over-parameterization in manifold based surrogates and deep neural operators

Paper Authors

Katiana Kontolati, Somdatta Goswami, Michael D. Shields, George Em Karniadakis

Paper Abstract

Constructing accurate and generalizable approximators for complex physico-chemical processes exhibiting highly non-smooth dynamics is challenging. In this work, we propose new developments and perform comparisons for two promising approaches: manifold-based polynomial chaos expansion (m-PCE) and the deep neural operator (DeepONet), and we examine the effect of over-parameterization on generalization. We demonstrate the performance of these methods in terms of generalization accuracy by solving the 2D time-dependent Brusselator reaction-diffusion system with uncertainty sources, modeling an autocatalytic chemical reaction between two species. We first propose an extension of the m-PCE by constructing a mapping between latent spaces formed by two separate embeddings of input functions and output QoIs. To enhance the accuracy of the DeepONet, we introduce weight self-adaptivity in the loss function. We demonstrate that the performance of m-PCE and DeepONet is comparable for cases of relatively smooth input-output mappings. However, when highly non-smooth dynamics are considered, DeepONet shows higher accuracy. We also find that for m-PCE, modest over-parameterization leads to better generalization, both within and outside of distribution, whereas aggressive over-parameterization leads to over-fitting. In contrast, even a highly over-parameterized DeepONet leads to better generalization for both smooth and non-smooth dynamics. Furthermore, we compare the performance of the above models with another operator learning model, the Fourier Neural Operator, and show that its over-parameterization also leads to better generalization. Our studies show that m-PCE can provide very good accuracy at very low training cost, whereas a highly over-parameterized DeepONet can provide better accuracy and robustness to noise, but at higher training cost. In both methods, the inference cost is negligible.
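
The m-PCE extension described in the abstract maps between two latent spaces: one built from the input functions and one from the output QoIs. Below is a minimal sketch of that latent-to-latent idea, assuming PCA embeddings and ridge regression on polynomial features as a simple stand-in for the polynomial chaos expansion itself; the data, dimensions, and model choices here are illustrative, not the paper's implementation.

```python
# Minimal sketch of a latent-to-latent surrogate in the spirit of m-PCE.
# Assumptions (not from the paper): PCA embeddings for inputs and outputs,
# and polynomial ridge regression as a stand-in for the PCE mapping.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1024))                       # flattened input fields (hypothetical)
Y = np.tanh(X @ rng.standard_normal((1024, 256)) / 32.0)   # flattened output QoIs (hypothetical)

pca_in, pca_out = PCA(n_components=8), PCA(n_components=8)
Z_in = pca_in.fit_transform(X)    # latent coordinates of the inputs
Z_out = pca_out.fit_transform(Y)  # latent coordinates of the outputs

# Map the input latent space to the output latent space with a low-order polynomial model.
surrogate = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1e-3))
surrogate.fit(Z_in, Z_out)

# Prediction: embed a new input, map between latent spaces, decode to the full output field.
Z_pred = surrogate.predict(pca_in.transform(X[:5]))
Y_pred = pca_out.inverse_transform(Z_pred)
print(Y_pred.shape)  # (5, 256)
```

Because both embeddings are low-dimensional, the regression between latent spaces stays cheap, which is consistent with the abstract's observation that m-PCE has very low training cost.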
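The weight self-adaptivity introduced for the DeepONet loss can be illustrated with trainable per-sample weights that are updated by gradient ascent while the network parameters are updated by descent, so poorly fit samples receive more emphasis. The PyTorch sketch below shows this general mechanism only, not the authors' DeepONet code; the network, data, and hyperparameters are placeholders.

```python
# Minimal PyTorch sketch of self-adaptive loss weighting: per-sample weights
# are trained by gradient *ascent* while the network trains by descent.
import torch

net = torch.nn.Sequential(torch.nn.Linear(4, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
x = torch.randn(128, 4)                    # hypothetical training inputs
y = torch.sin(x.sum(dim=1, keepdim=True))  # hypothetical targets

log_w = torch.zeros(128, 1, requires_grad=True)  # log-weights keep the weights positive
opt_net = torch.optim.Adam(net.parameters(), lr=1e-3)
opt_w = torch.optim.Adam([log_w], lr=1e-2)

for step in range(1000):
    residual = (net(x) - y) ** 2
    loss = (log_w.exp() * residual).mean()  # weighted squared-error loss
    opt_net.zero_grad()
    opt_w.zero_grad()
    loss.backward()
    opt_net.step()      # descent on the network parameters
    log_w.grad.neg_()   # flip the sign of the weight gradients
    opt_w.step()        # so this step performs ascent on the weights
```

In practice the self-adaptive weights are typically normalized or bounded so they do not grow without limit; this sketch omits that for brevity.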
