Paper title
A fast Monte Carlo test for preferential sampling
Paper authors
Paper abstract
The preferential sampling of locations chosen to observe a spatio-temporal process has been identified as a major problem across multiple fields. Predictions of the process can be severely biased when standard statistical methodologies are applied to preferentially sampled data without adjustment. Currently, methods that can adjust for preferential sampling are rarely implemented in the software packages most popular with researchers. Furthermore, they are technically demanding to design and fit. This paper presents a fast and intuitive Monte Carlo test for detecting preferential sampling. The test can be applied across a wide range of data types. Importantly, the method can also help with the discovery of a set of informative covariates that can sufficiently control for the preferential sampling. The discovery of these covariates can justify continued use of standard methodologies. A thorough simulation study is presented to demonstrate both the power and validity of the test in various data settings. The test is shown to attain high power for non-Gaussian data with sample sizes as low as 50. Finally, two previously published case studies are revisited and new insights into the nature of the informative sampling are gained. The test can be implemented with the R package PStestR.