Paper Title

Classification of Computer Models with Labelled Outputs

Authors

Louise Kimpton, Peter Challenor, Daniel Williamson

Abstract

Classification is a vital tool for modelling many complex numerical models. A model or system may be such that, for certain areas of input space, the output either does not exist or is not in a quantifiable form. Here, we present a new method for classification in which the model outputs are given distinct classifying labels, which we model using a latent Gaussian process (GP). The latent variable is estimated using MCMC sampling, a unique likelihood and distinct prior specifications. Our classifier is then verified by calculating a misclassification rate across the input space. Comparisons are made with other existing classification methods, including logistic regression, which models the probability of being classified into one of two regions. To make classification predictions with logistic regression, labels are drawn from independent Bernoulli distributions, so the correlation between nearby inputs is lost in the independent draws, which can result in many misclassifications. By modelling the labels using a latent GP, our method avoids this problem. We apply our novel method to a range of examples, including a motivating example that models the hormones associated with the reproductive system in mammals, where the two labelled outputs are high and low rates of reproduction.
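
The contrast the abstract draws between independent Bernoulli draws and labels obtained from a latent GP can be illustrated with a minimal sketch. This is not the authors' classifier: there is no MCMC, and the paper's likelihood and prior specifications are not reproduced. The one-dimensional input grid, the squared-exponential kernel, the sigmoid link, the rule "label 1 where the latent is positive", and the helper names `rbf_kernel` and `count_label_switches` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance matrix (an assumed kernel choice)."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def count_label_switches(labels):
    """Number of places where neighbouring points receive different labels."""
    return int(np.sum(labels[1:] != labels[:-1]))


rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)

# Draw one smooth latent function from a zero-mean GP prior.
# The small jitter on the diagonal keeps the Cholesky factorisation stable.
K = rbf_kernel(x) + 1e-6 * np.eye(x.size)
latent = np.linalg.cholesky(K) @ rng.standard_normal(x.size)

# Assumed labelling rule: class 1 where the latent is positive.
# Neighbouring inputs share a label because the latent is correlated.
gp_labels = (latent > 0).astype(int)

# Logistic-style alternative: turn the same latent into probabilities,
# then draw each label independently, which discards that correlation.
probs = 1.0 / (1.0 + np.exp(-latent))
bernoulli_labels = rng.binomial(1, probs)

gp_switches = count_label_switches(gp_labels)
bern_switches = count_label_switches(bernoulli_labels)
print("label switches, thresholded latent GP:", gp_switches)
print("label switches, independent Bernoulli:", bern_switches)
```

Under these assumptions the thresholded latent yields a few contiguous regions, while the independent draws flip labels between neighbouring inputs far more often, which is the behaviour the abstract identifies as a source of misclassification.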