Paper title
Repro Samples Method for Finite- and Large-Sample Inferences
Paper authors
Abstract
This article presents a novel, general, and effective simulation-inspired approach, called the {\it repro samples method}, for conducting statistical inference. The approach studies the performance of artificial samples, referred to as {\it repro samples}, which are obtained by mimicking the observed sample, in order to quantify uncertainty and construct confidence sets for parameters of interest with guaranteed coverage rates. Both exact and asymptotic inferences are developed. An attractive feature of the general framework is that it does not rely on the large-sample central limit theorem and is likelihood-free. It is therefore effective for complicated inference problems that cannot be solved using the large-sample central limit theorem. The proposed method applies to a wide range of problems, including many open questions for which solutions were previously unavailable, for example, those involving discrete or non-numerical parameters. To reduce the large computational cost of such inference problems, we develop a unique matching scheme to obtain a data-driven candidate set. Moreover, we show the advantages of the proposed framework over the classical Neyman-Pearson framework. We demonstrate the effectiveness of the proposed approach on various models throughout the paper and provide a case study that addresses an open inference question: how to quantify the uncertainty in the unknown number of components of a normal mixture model. To evaluate the empirical performance of the repro samples method, we conduct simulations and study real data examples, with comparisons to existing approaches. Although the development pertains to settings where the large-sample central limit theorem does not apply, it also extends directly to cases where the central limit theorem does hold.
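To make the idea of mimicking the observed sample concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes a toy Bernoulli model written as y = 1{u <= theta} with u uniform on (0, 1), a fixed grid of candidate values, and a simple Monte Carlo acceptance rule based on the success count. The function name `repro_confidence_set` and all of its parameters are hypothetical. Because the acceptance region is estimated by simulation, this sketch only approximates the target level and does not reproduce the coverage guarantees stated in the abstract.

```python
import random


def repro_confidence_set(y_obs, candidates, alpha=0.95, n_repro=2000, seed=0):
    """Toy sketch: keep each candidate theta under which the observed
    success count looks typical among simulated artificial samples."""
    rng = random.Random(seed)
    n = len(y_obs)
    s_obs = sum(y_obs)  # summary statistic of the observed sample
    kept = []
    for theta in candidates:
        # Artificial ("repro") samples: regenerate the data as
        # y* = 1{u* <= theta} with fresh uniform draws u*, then summarize.
        sims = sorted(
            sum(1 for _ in range(n) if rng.random() <= theta)
            for _ in range(n_repro)
        )
        # Central region holding roughly a fraction alpha of simulated counts.
        lo = sims[int((1 - alpha) / 2 * n_repro)]
        hi = sims[min(n_repro - 1, int((1 + alpha) / 2 * n_repro))]
        if lo <= s_obs <= hi:
            kept.append(theta)
    return kept


# Hypothetical data: 7 successes in 20 trials, candidates on a grid of 0.05.
y_obs = [1] * 7 + [0] * 13
candidates = [i / 20 for i in range(1, 20)]
cs = repro_confidence_set(y_obs, candidates)
```

The candidate grid here is fixed by hand. The abstract instead describes a matching scheme that yields a data-driven candidate set, which this sketch does not attempt to implement.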