Paper Title
Learning Operators with Coupled Attention
Paper Authors
Paper Abstract
Supervised operator learning is an emerging machine learning paradigm with applications to modeling the evolution of spatio-temporal dynamical systems and approximating general black-box relationships between functional data. We propose a novel operator learning method, LOCA (Learning Operators with Coupled Attention), motivated by the recent success of the attention mechanism. In our architecture, the input functions are mapped to a finite set of features, which are then averaged with attention weights that depend on the output query locations. By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions, enabling us to approximate nonlinear operators even when the number of output function measurements in the training set is very small. Our formulation is accompanied by rigorous approximation-theoretic guarantees on the universal expressiveness of the proposed model. Empirically, we evaluate the performance of LOCA on several operator learning scenarios involving systems governed by ordinary and partial differential equations, as well as a black-box climate prediction problem. Through these scenarios we demonstrate state-of-the-art accuracy, robustness with respect to noisy input data, and a consistently small spread of errors over testing data sets, even for out-of-distribution prediction tasks.
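The mechanism the abstract describes (features of the input function, averaged with query-dependent attention weights that are coupled through an integral transform) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the fixed cosine feature map `features`, the sinusoidal score function `score`, the Gaussian kernel and its `length_scale`, the uniform quadrature grid, and the scalar output. In the actual model these components would be learned.

```python
import math

def softmax(scores):
    # Numerically stable softmax: turns scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def features(u_samples, n):
    # Hypothetical feature map v(u): n fixed cosine functionals of the
    # sampled input function (a stand-in for a learned encoder).
    m = len(u_samples)
    return [
        sum(math.cos(k * math.pi * j / m) * u for j, u in enumerate(u_samples)) / m
        for k in range(n)
    ]

def score(y, n):
    # Hypothetical score function g(y) with n components
    # (a stand-in for a learned network of the query location).
    return [math.sin((k + 1) * y) for k in range(n)]

def coupled_score(y, grid, n, length_scale=0.2):
    # Integral transform, approximated on a quadrature grid: a normalized
    # Gaussian-kernel average of g over nearby query locations, so that
    # attention weights at neighboring queries are correlated.
    w = [math.exp(-((y - z) ** 2) / (2 * length_scale ** 2)) for z in grid]
    total = sum(w)
    g = [score(z, n) for z in grid]
    return [sum(wi * gi[k] for wi, gi in zip(w, g)) / total for k in range(n)]

def predict(u_samples, y, grid, n=4):
    # Output at query y: attention-weighted average of the input features.
    v = features(u_samples, n)
    phi = softmax(coupled_score(y, grid, n))
    return sum(p * vi for p, vi in zip(phi, v)), phi

# Usage: an input function sampled at 16 points, queried at y = 0.3.
u = [math.sin(2 * math.pi * j / 16) for j in range(16)]
grid = [i / 20 for i in range(21)]
out, phi = predict(u, 0.3, grid)
```

Because the attention weights are a softmax output, the prediction at each query is a convex combination of the features. The kernel averaging step is what ties the weights at different query locations together.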