Paper title
A Tutorial on Learning With Bayesian Networks
Authors
Abstract
A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data analysis. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling approach using a real-world case study.
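The abstract's first claim, that a model encoding dependencies among all variables readily handles missing data entries, can be illustrated with a minimal sketch. The three-variable network below (Rain -> WetGrass <- Sprinkler), its probability values, and the function names `joint` and `posterior_rain` are all hypothetical and chosen only for illustration; they do not come from the paper. The sketch assumes binary variables and uses plain enumeration rather than any particular inference algorithm.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler.
# All probabilities are made-up illustration values, not from the paper.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(WetGrass=True | Rain, Sprinkler), keyed by (rain, sprinkler)
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.05,
}


def joint(rain, sprinkler, wet):
    """Joint probability via the network factorization P(R) * P(S) * P(W | R, S)."""
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain(wet, sprinkler=None):
    """P(Rain=True | evidence).

    sprinkler=None means that entry is missing from the record; it is
    summed out of the joint instead of the record being discarded.
    """
    sprinkler_values = [True, False] if sprinkler is None else [sprinkler]
    score = {
        rain: sum(joint(rain, s, wet) for s in sprinkler_values)
        for rain in (True, False)
    }
    return score[True] / (score[True] + score[False])


# Sanity check: the factorized joint sums to 1 over all assignments.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))
```

Calling `posterior_rain(True)` answers the query when the sprinkler entry is missing, while `posterior_rain(True, sprinkler=True)` uses the full record. With these illustrative numbers, observing that the sprinkler was on lowers the posterior probability of rain, since the sprinkler already accounts for the wet grass.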