Paper title

Multiple testing of composite null hypotheses for discrete data using randomized $p$-values

Authors

Daniel Ochieng, Anh-Tuan Hoang, Thorsten Dickhaus

Abstract

$P$-values that are derived from continuously distributed test statistics are typically uniformly distributed on $(0,1)$ under least favorable parameter configurations (LFCs) in the null hypothesis. Conservativeness of a $p$-value $P$ (meaning that $P$ is under the null hypothesis stochastically larger than a random variable which is uniformly distributed on $(0,1)$) can occur if the test statistic from which $P$ is derived is discrete, or if the true parameter value under the null is not an LFC. To deal with both of these sources of conservativeness, we present two approaches utilizing randomized $p$-values, namely single-stage and two-stage randomization. We illustrate their effectiveness for testing a composite null hypothesis under a binomial model. We also give an example of how the proposed $p$-values can be used to test a composite null in group testing designs. Similar to previous findings, we find that the proposed randomized $p$-values are less conservative compared to non-randomized $p$-values under the null hypothesis, but that they are stochastically not smaller under the alternative. The problem of establishing the validity of randomized $p$-values is not trivial and has received attention in previous literature. We show that our proposed randomized $p$-values are valid under various discrete statistical models which are such that the distribution of the corresponding test statistic belongs to an exponential family. The behaviour of the power function for the tests based on the proposed randomized $p$-values as a function of the sample size is also investigated. Simulations and a real data analysis are used to compare the different considered $p$-values.
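
To make the discreteness issue concrete, here is a minimal sketch of the textbook randomized $p$-value for a one-sided binomial test of $H_0: \theta \le \theta_0$, evaluated at the LFC $\theta = \theta_0$. It is not necessarily the authors' single-stage or two-stage construction, which the abstract does not spell out. The function names (`binom_pmf`, `lfc_p_value`, `randomized_p_value`) and the one-sided form of the null are assumptions made for illustration.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Probability of exactly k successes in n Bernoulli(theta) trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def lfc_p_value(x, n, theta0):
    """Non-randomized p-value P_{theta0}(X >= x) for H0: theta <= theta0."""
    return sum(binom_pmf(k, n, theta0) for k in range(x, n + 1))


def randomized_p_value(x, n, theta0, u):
    """Textbook randomized p-value P_{theta0}(X > x) + u * P_{theta0}(X = x).

    u is a draw from Uniform(0, 1), independent of the data. At theta = theta0
    this p-value is exactly uniform on (0, 1), which removes the
    conservativeness caused by the discreteness of X.
    """
    upper_tail = sum(binom_pmf(k, n, theta0) for k in range(x + 1, n + 1))
    return upper_tail + u * binom_pmf(x, n, theta0)
```

For any `u` in $[0, 1]$ the randomized value is never larger than the non-randomized one, and the two coincide at `u = 1`. In practice `u` would be drawn once per test, for example with `random.random()`. This sketch addresses only the discreteness source of conservativeness; a true parameter away from the LFC is the second source the abstract mentions and is not handled here.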
