Paper Title
Automated Learning of Interpretable Models with Quantified Uncertainty
Paper Authors
Paper Abstract
Interpretability and uncertainty quantification in machine learning can provide justification for decisions, promote scientific discovery and lead to a better understanding of model behavior. Symbolic regression provides inherently interpretable machine learning, but relatively little work has focused on the use of symbolic regression on noisy data and the accompanying necessity to quantify uncertainty. A new Bayesian framework for genetic-programming-based symbolic regression (GPSR) is introduced that uses model evidence (i.e., marginal likelihood) to formulate replacement probability during the selection phase of evolution. Model parameter uncertainty is automatically quantified, enabling probabilistic predictions with each equation produced by the GPSR algorithm. Model evidence is also quantified in this process, and its use is shown to increase interpretability, improve robustness to noise, and reduce overfitting when compared to a conventional GPSR implementation on both numerical and physical experiments.
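The abstract's core idea, using model evidence to drive replacement decisions during evolutionary selection, can be illustrated with a toy sketch. This is not the paper's implementation: it approximates the log marginal likelihood with the BIC, uses two hypothetical candidate equations (linear vs. quadratic) rather than a GP population, and the sigmoid mapping from the evidence difference to a replacement probability is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy observations from a quadratic ground truth (toy data).
x = np.linspace(-1.0, 1.0, 40)
y = 1.5 * x**2 + 0.3 * x + rng.normal(0.0, 0.1, x.size)

def log_evidence_bic(design, y):
    """Approximate the log model evidence with the BIC: a cheap
    stand-in for the marginal likelihood; the paper's framework
    computes the evidence itself."""
    n, k = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / n
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return log_lik - 0.5 * k * np.log(n)

# Two candidate "equations" standing in for GPSR individuals.
linear = np.column_stack([np.ones_like(x), x])
quadratic = np.column_stack([np.ones_like(x), x, x**2])

ev_lin = log_evidence_bic(linear, y)
ev_quad = log_evidence_bic(quadratic, y)

# Replacement probability (assumed sigmoid form): chance the
# challenger (quadratic) replaces the incumbent (linear).
p_replace = 1.0 / (1.0 + np.exp(ev_lin - ev_quad))
print(p_replace)
```

Because the data are truly quadratic, the evidence difference strongly favors the quadratic candidate, so the replacement probability is close to 1; for two equally supported models it would hover near 0.5, which is how evidence-based selection discourages needlessly complex, overfit expressions.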