Paper title
A critical examination of compound stability predictions from machine-learned formation energies
Authors
Abstract
Machine learning has emerged as a novel tool for the efficient prediction of materials properties, and claims have been made that machine-learned models for the formation energy of compounds can approach the accuracy of Density Functional Theory (DFT). The models tested in this work include five recently published compositional models, a baseline model using stoichiometry alone, and a structural model. By testing seven machine learning models for formation energy on stability predictions using the Materials Project database of DFT calculations for 85,014 unique chemical compositions, we show that while formation energies can indeed be predicted well, all compositional models perform poorly on predicting the stability of compounds, making them considerably less useful than DFT for the discovery and design of new solids. Most critically, in sparse chemical spaces where few stoichiometries have stable compounds, only the structural model is capable of efficiently detecting which materials are stable. The non-incremental improvement of structural models compared with compositional models is noteworthy and encourages the use of structural models for materials discovery, with the constraint that for any new composition, the ground-state structure is not known a priori. This work demonstrates that accurate predictions of formation energy do not imply accurate predictions of stability, emphasizing the importance of assessing model performance on stability predictions, for which we provide a set of publicly available tests.
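The central claim, that a small formation-energy error can still produce a wrong stability call, can be illustrated with a toy calculation. The sketch below assumes stability is judged with the usual convex hull construction over formation energy versus composition; the abstract does not spell out the procedure, so treat that as an assumption. The binary A-B system, the function names, and all energies (in eV/atom) are hypothetical values invented for illustration, not data from the paper.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) points (Andrew's monotone chain)."""
    # Keep only the lowest energy at each composition x.
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))
    hull = []
    for p in sorted(best.items()):
        # Pop the last hull point while it lies on or above the line
        # from the point before it to the new point p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(formation_energies):
    """Map each composition x (fraction of B) to its energy above the hull."""
    # The pure elements A (x=0) and B (x=1) have zero formation energy.
    points = [(0.0, 0.0), (1.0, 0.0)] + list(formation_energies.items())
    hull = lower_hull(points)
    result = {}
    for x, e in formation_energies.items():
        for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
            if x1 <= x <= x2:
                # Linear interpolation of the hull at composition x.
                e_hull = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
                result[x] = e - e_hull
                break
    return result


# Hypothetical reference ("DFT") and model ("ML") formation energies, eV/atom.
dft = {0.25: -0.30, 0.50: -0.50, 0.75: -0.28}
ml = {0.25: -0.24, 0.50: -0.54, 0.75: -0.30}

# Mean absolute error of the model on formation energy.
mae = sum(abs(ml[x] - dft[x]) for x in dft) / len(dft)

TOL = 1e-9  # a compound counts as stable if it sits on the hull
dft_stable = {x for x, d in energy_above_hull(dft).items() if d <= TOL}
ml_stable = {x for x, d in energy_above_hull(ml).items() if d <= TOL}

print(f"formation-energy MAE: {mae:.3f} eV/atom")
print("stable per reference:", sorted(dft_stable))
print("stable per model:    ", sorted(ml_stable))
```

In this toy system the model's mean absolute error is 0.04 eV/atom, yet the compound at x = 0.25 moves from on the hull to about 0.03 eV/atom above it. Stability depends on energy differences between competing compositions, so errors of opposite sign at neighbouring compositions do not cancel the way they can in an average formation-energy metric.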