Paper Title
Learning to Make Chemical Predictions: the Interplay of Feature Representation, Data, and Machine Learning Algorithms
Paper Authors
Paper Abstract
Supervised machine learning has recently been on the rise as a source of new predictive approaches for applications in the chemical, biological, and materials sciences. In this Perspective we focus on the interplay of machine learning algorithms with chemically motivated descriptors and with the size and type of data sets needed for molecular property prediction. Using Nuclear Magnetic Resonance chemical shift prediction as an example, we demonstrate that success is predicated on the choice of feature-extracted or real-space representations of chemical structures and on whether the molecular property data is abundant and/or experimentally or computationally derived, and we show how these together influence the correct choice among popular machine learning algorithms drawn from deep learning, random forests, or kernel methods.