Active Learning a Neural Network Model for Gold Clusters & Bulk from Sparse First-Principles Training Data

Paper title

Active Learning a Neural Network Model for Gold Clusters & Bulk from Sparse First-Principles Training Data

Authors

Troy D. Loeffler, Sukriti Manna, Tarak K. Patra, Henry Chan, Badri Narayanan, Subramanian Sankaranarayanan

Abstract

Small metal clusters are of fundamental scientific interest and of tremendous significance in catalysis. These nanoscale clusters display diverse geometries and structural motifs depending on the cluster size; knowledge of these size-dependent structural motifs and their dynamical evolution has been of longstanding interest. Classical MD typically employs predefined functional forms, which limits its ability to capture such complex size-dependent structural and dynamical transformations. Neural network (NN) based potentials represent flexible alternatives and, in principle, well-trained NN potentials can provide a high level of flexibility, transferability, and accuracy on par with the reference model used for training. A major challenge, however, is that NN models are interpolative and require large quantities of training data to ensure that the model adequately samples the energy landscape both near and far from equilibrium. Here, we introduce an active learning (AL) scheme that trains an NN model on the fly with a minimal amount of first-principles-based training data. Our AL workflow is initiated with a sparse training dataset (1 to 5 data points) and is updated on the fly via a Nested Ensemble Monte Carlo scheme that iteratively queries the energy landscape in regions of failure and updates the training pool to improve the network performance. Using a representative system of gold clusters, we demonstrate that our AL workflow can train an NN with ~500 total reference calculations. Our NN predictions are within 30 meV/atom and 40 meV/Å of the reference DFT calculations. Moreover, our AL-NN model also adequately captures the various size-dependent structural and dynamical properties of gold clusters, in excellent agreement with DFT calculations and available experiments.
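
The loop described in the abstract (start from a very sparse training pool, sample configurations with the current surrogate, check them against the reference method, and add the failures back to the pool) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a Lennard-Jones dimer in reduced units stands in for DFT, a linear least-squares fit on inverse-power features stands in for the neural network, and a single Metropolis walker stands in for the Nested Ensemble Monte Carlo sampler. The names `reference_energy`, `features`, `fit`, `predict`, and `active_learn` are hypothetical.

```python
import numpy as np

def reference_energy(r):
    # Stand-in for an expensive first-principles call (assumption):
    # a Lennard-Jones dimer in reduced units, not DFT.
    r = np.asarray(r, dtype=float)
    return 4.0 * (r**-12 - r**-6)

def features(r):
    # Inverse-power descriptors; a linear model on these replaces the NN.
    r = np.atleast_1d(np.asarray(r, dtype=float))
    return np.stack([r**-12, r**-6, np.ones_like(r)], axis=1)

def fit(pool_r, pool_e):
    # Least squares; returns the minimum-norm solution while the pool
    # is still smaller than the number of features.
    coef, *_ = np.linalg.lstsq(features(pool_r), np.asarray(pool_e), rcond=None)
    return coef

def predict(coef, r):
    return features(r) @ coef

def active_learn(seed=0, tol=1e-3, kT=0.5, n_steps=200,
                 passes_needed=5, max_rounds=100):
    rng = np.random.default_rng(seed)
    # Sparse initial pool: a single reference data point.
    pool_r = [1.5]
    pool_e = [float(reference_energy(1.5))]
    coef = fit(pool_r, pool_e)
    passes = 0
    for _ in range(max_rounds):
        # Metropolis walk on the *surrogate* energy surface.
        r = pool_r[-1]
        for _ in range(n_steps):
            trial = r + rng.normal(scale=0.1)
            if not 0.9 <= trial <= 2.5:
                continue  # reject moves outside the sampled range
            d_e = predict(coef, trial)[0] - predict(coef, r)[0]
            if d_e <= 0.0 or rng.random() < np.exp(-d_e / kT):
                r = trial
        # One reference call per round, at the walker's final configuration.
        e_ref = float(reference_energy(r))
        if abs(predict(coef, r)[0] - e_ref) > tol:
            # Region of failure: add it to the pool and retrain.
            pool_r.append(r)
            pool_e.append(e_ref)
            coef = fit(pool_r, pool_e)
            passes = 0
        else:
            passes += 1
            if passes >= passes_needed:
                break
    return coef, pool_r

coef, pool = active_learn(seed=0)
grid = np.linspace(0.9, 2.5, 50)
max_err = np.max(np.abs(predict(coef, grid) - reference_energy(grid)))
```

In this toy the features span the reference function exactly, so the pool stops growing after a handful of points; a real NN potential has no such guarantee, which is why the abstract reports on the order of 500 reference calculations for gold clusters.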

