Paper Title
Flexible Bayesian Nonlinear Model Configuration
Paper Authors
Paper Abstract
Regression models are used in a wide range of applications, providing a powerful scientific tool for researchers from different fields. Linear, or simple parametric, models are often not sufficient to describe complex relationships between input variables and a response. Such relationships can be better described through flexible approaches such as neural networks, but this results in less interpretable models and potential overfitting. Alternatively, specific parametric nonlinear functions can be used, but specifying such functions is in general complicated. In this paper, we introduce a flexible approach for the construction and selection of highly flexible nonlinear parametric regression models. Nonlinear features are generated hierarchically, similarly to deep learning, but with additional flexibility in the possible types of features to be considered. This flexibility, combined with variable selection, allows us to find a small set of important features and thereby more interpretable models. Within the space of possible functions, a Bayesian approach is considered, introducing priors for functions based on their complexity. A genetically modified mode jumping Markov chain Monte Carlo algorithm is adopted to perform Bayesian inference and estimate posterior probabilities for model averaging. In various applications, we illustrate how our approach is used to obtain meaningful nonlinear models. Additionally, we compare its predictive performance with several machine learning algorithms.
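As a rough illustration of two ideas named in the abstract, hierarchical generation of nonlinear features and a prior that penalizes complex features, the following Python sketch may help. It is not the paper's algorithm: the transformation set, the complexity measure (a count of operations), the penalty-style log-prior, and every name in the code are assumptions made for illustration, and the Markov chain Monte Carlo search itself is not shown.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical set of nonlinear transformations (an assumption, not from the paper).
TRANSFORMS = {"sin": math.sin, "tanh": math.tanh, "sq": lambda v: v * v}


@dataclass(frozen=True)
class Feature:
    name: str                                   # human-readable formula
    evaluate: Callable[[Sequence[float]], float]
    complexity: int                             # assumed measure: operations used


def input_feature(j: int) -> Feature:
    """A raw input variable x_j, with complexity 0."""
    return Feature(f"x{j}", lambda x, j=j: x[j], 0)


def transform(name: str, g: Callable[[float], float], f: Feature) -> Feature:
    """Apply a nonlinear function g to an existing feature (one more operation)."""
    return Feature(f"{name}({f.name})", lambda x: g(f.evaluate(x)), f.complexity + 1)


def multiply(a: Feature, b: Feature) -> Feature:
    """Multiply two existing features to form an interaction."""
    return Feature(
        f"({a.name}*{b.name})",
        lambda x: a.evaluate(x) * b.evaluate(x),
        a.complexity + b.complexity + 1,
    )


def grow(pool: List[Feature], rng: random.Random, n_new: int) -> List[Feature]:
    """Add one 'layer': n_new features built from features already in the pool."""
    new = []
    for _ in range(n_new):
        if rng.random() < 0.5:
            name = rng.choice(sorted(TRANSFORMS))
            new.append(transform(name, TRANSFORMS[name], rng.choice(pool)))
        else:
            new.append(multiply(rng.choice(pool), rng.choice(pool)))
    return pool + new


def log_prior(features: Sequence[Feature], penalty: float = 1.0) -> float:
    """Assumed unnormalized log-prior: more complex features are less likely a priori."""
    return -penalty * sum(f.complexity for f in features)


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [input_feature(0), input_feature(1)]
    for _ in range(2):                          # two hierarchical layers
        pool = grow(pool, rng, n_new=3)
    for f in pool:
        print(f.name, f.complexity, f.evaluate([0.5, 2.0]))
```

Because each new feature is built only from features already in the pool, repeated calls to `grow` yield nested formulas that stay readable as expressions, which is the interpretability argument the abstract makes. A prior of this kind would push model selection towards a few shallow features unless the data support deeper ones.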