Paper Title

Clusterization in D-optimal designs: the case against linearization

Author

Daon, Yair

Abstract

Estimation of parameters in physical processes often demands costly measurements, prompting the pursuit of an optimal measurement strategy. Finding such a strategy is termed the problem of optimal experimental design, abbreviated as optimal design. Remarkably, optimal designs can yield tightly clustered measurement locations, leading researchers to fundamentally revise the design problem just to circumvent this issue. Some authors introduce correlation among error terms that are initially independent, while others restrict measurement locations to a finite set. While both approaches may prevent clusterization, they also fundamentally alter the optimal design problem. In this study, we consider Bayesian D-optimal designs, i.e., designs that maximize the expected Kullback-Leibler divergence between posterior and prior. We propose an analytically tractable model for D-optimal designs over Hilbert spaces. In this framework, we make several key contributions: (a) We establish that measurement clusterization is a generic trait of D-optimal designs for linear inverse problems with independent Gaussian measurement errors and a Gaussian prior. (b) We prove that introducing correlations among measurement error terms mitigates clusterization. (c) We characterize D-optimal designs as reducing uncertainty across a subset of prior covariance eigenvectors. (d) We leverage this characterization to argue that measurement clusterization arises as a consequence of the pigeonhole principle: when more measurements are taken than there are locations where the select eigenvectors are large and the others are small, clusterization occurs. Finally, we use our analysis to argue against the use of Gaussian priors with linearized physical models when seeking a D-optimal design.
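
To make the D-optimality criterion in the abstract concrete, the following is a minimal finite-dimensional sketch, not the paper's Hilbert-space model. It assumes a linear model y = F m + e with prior m ~ N(0, Gamma) and independent errors e ~ N(0, sigma^2 I); in that setting the expected Kullback-Leibler divergence between posterior and prior is the standard quantity (1/2) log det(I + F Gamma F^T / sigma^2). All function names and the toy numbers below are illustrative assumptions.

```python
import math


def gram_matrix(rows, prior_cov, noise_var):
    """Return I + F Gamma F^T / sigma^2, where each entry of `rows` is one measurement row of F."""
    n, d = len(rows), len(prior_cov)
    # cov_rows[j] holds the vector Gamma f_j for measurement row f_j.
    cov_rows = [[sum(prior_cov[i][k] * f[k] for k in range(d)) for i in range(d)]
                for f in rows]
    return [[(1.0 if i == j else 0.0)
             + sum(rows[i][k] * cov_rows[j][k] for k in range(d)) / noise_var
             for j in range(n)]
            for i in range(n)]


def log_det_spd(matrix):
    """Log-determinant of a symmetric positive definite matrix via Cholesky factorization."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def expected_information_gain(rows, prior_cov, noise_var):
    """D-optimality objective for a linear Gaussian model with independent errors."""
    return 0.5 * log_det_spd(gram_matrix(rows, prior_cov, noise_var))


# Hypothetical toy problem: three parameters, diagonal prior covariance.
prior_cov = [[4.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 0.25]]
noise_var = 0.1
f = [1.0, 0.5, 0.0]  # one measurement functional
g = [0.0, 1.0, 1.0]  # a different measurement functional

gain_single = expected_information_gain([f], prior_cov, noise_var)
gain_repeated = expected_information_gain([f, f], prior_cov, noise_var)
gain_distinct = expected_information_gain([f, g], prior_cov, noise_var)
```

In this toy setting, repeating the measurement f still raises the objective, because with independent errors a repeated measurement acts like a single one with halved noise variance. This is only meant to illustrate why the criterion does not by itself rule out coincident measurements; which design is actually optimal depends on the problem, as the abstract discusses.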
