Paper Title
So3krates: Equivariant attention for interactions on arbitrary length-scales in molecular systems
Paper Authors
Paper Abstract
The application of machine learning methods in quantum chemistry has enabled the study of numerous chemical phenomena that are computationally intractable with traditional ab initio methods. However, some quantum mechanical properties of molecules and materials depend on non-local electronic effects, which are often neglected due to the difficulty of modeling them efficiently. This work proposes a modified attention mechanism adapted to the underlying physics, which allows the relevant non-local effects to be recovered. Namely, we introduce spherical harmonic coordinates (SPHCs) to reflect higher-order geometric information for each atom in a molecule, enabling a non-local formulation of attention in SPHC space. Our proposed model So3krates, a self-attention-based message-passing neural network, uncouples geometric information from atomic features, making them independently amenable to attention mechanisms. Thereby we construct spherical filters, which extend the concept of continuous filters in Euclidean space to SPHC space and serve as the foundation for a spherical self-attention mechanism. We show that, in contrast to other published methods, So3krates is able to describe non-local quantum mechanical effects over arbitrary length scales. Further, we find evidence that the inclusion of higher-order geometric correlations increases data efficiency and improves generalization. So3krates matches or exceeds state-of-the-art performance on popular benchmarks, notably requiring significantly fewer parameters (0.25-0.4x) while at the same time giving a substantial speedup (6-14x for training and 2-11x for inference) compared to other models.
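To make the abstract's central idea concrete, the sketch below illustrates, in simplified form, how per-atom spherical harmonic coordinates (SPHCs) can be built from neighbor directions and how an attention weight can be modulated by a filter acting on SPHC differences. This is an illustrative assumption-laden sketch, not the So3krates reference implementation: the function names (degree1_sphc, spherical_attention), the restriction to degree-1 harmonics, the single-weight-matrix "spherical filter" (w_filter), and all shapes are hypothetical choices made for brevity.

```python
# Minimal sketch (assumptions throughout): degree-1 SPHCs plus an attention step
# whose logits are modulated by a filter over SPHC differences. Not the actual
# So3krates architecture; it only mirrors the idea described in the abstract.
import jax
import jax.numpy as jnp


def degree1_sphc(positions, neighbors):
    """Per-atom SPHC from degree-1 spherical harmonics of neighbor directions.

    positions: (N, 3) atomic coordinates
    neighbors: (N, K) integer indices of K neighbors per atom
    returns:   (N, 3) summed unit vectors (degree-1 real harmonics up to a constant)
    """
    r_ij = positions[neighbors] - positions[:, None, :]           # (N, K, 3)
    unit = r_ij / jnp.linalg.norm(r_ij, axis=-1, keepdims=True)   # (N, K, 3)
    return unit.sum(axis=1)                                       # (N, 3)


def spherical_attention(features, sphc, neighbors, w_q, w_k, w_filter):
    """One hypothetical attention update over a fixed neighbor list.

    The logits combine a feature dot-product with a filter evaluated on SPHC
    differences, so atoms with similar local geometry attend to each other
    more strongly (an assumption made for illustration).
    """
    q = features @ w_q                                    # (N, F) queries
    k = features[neighbors] @ w_k                         # (N, K, F) keys
    d_chi = sphc[neighbors] - sphc[:, None, :]            # (N, K, 3) SPHC differences
    filt = jax.nn.silu(d_chi @ w_filter)                  # (N, K, F) "spherical filter"
    logits = jnp.einsum('nf,nkf->nk', q, k * filt) / jnp.sqrt(q.shape[-1])
    alpha = jax.nn.softmax(logits, axis=-1)               # (N, K) attention weights
    return jnp.einsum('nk,nkf->nf', alpha, features[neighbors])


# Tiny usage example with random data.
key = jax.random.PRNGKey(0)
N, K, F = 5, 3, 8
pos = jax.random.normal(key, (N, 3))
nbr = jnp.stack([(jnp.arange(N) + s) % N for s in (1, 2, 3)], axis=1)  # toy neighbor list
feats = jax.random.normal(key, (N, F))
w_q = jax.random.normal(key, (F, F))
w_k = jax.random.normal(key, (F, F))
w_filter = jax.random.normal(key, (3, F))
chi = degree1_sphc(pos, nbr)
out = spherical_attention(feats, chi, nbr, w_q, w_k, w_filter)
print(out.shape)  # (5, 8)
```

In the sketch, the geometric information (chi) is carried separately from the atomic features (feats) and only enters through the filter, loosely reflecting the abstract's statement that So3krates uncouples geometric information from atomic features; the real model additionally uses higher-degree harmonics and learned, equivariance-preserving updates.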