Paper title
Consistency of permutation tests for HSIC and dHSIC
Paper authors
Paper abstract
The Hilbert--Schmidt Independence Criterion (HSIC) is a popular measure of the dependence between two random variables. The statistic dHSIC is an extension of HSIC that can be used to test joint independence of $d$ random variables. Such hypothesis tests for (joint) independence are often carried out as permutation tests, which compare the observed data with randomly permuted datasets. The main contribution of this work is a proof that the power of such independence tests converges to 1 as the sample size tends to infinity. This answers a question posed in (Pfister, 2018). Additionally, this work proves that HSIC and dHSIC permutation tests have the correct type 1 error rate and provides guidance on how to select the number of permutations to use in practice. While the correct type 1 error rate was already proved in (Pfister, 2018), we provide a modified proof following (Berrett, 2019), which extends to the case of non-continuous data. The number of permutations to use was studied, e.g., by (Marozzi, 2004), but not in the context of HSIC, with a slightly different estimate of the $p$-value, and for permutations rather than vectors of permutations. While the last two points have limited novelty, we include them to give a complete overview of permutation testing in the context of HSIC and dHSIC.
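For readers unfamiliar with the procedure, the following is a minimal sketch of a two-variable HSIC permutation test. It is not taken from the paper: the Gaussian kernel, the fixed bandwidth, the biased V-statistic form of the estimator, the function names, and the $p$-value estimate $(1 + \#\{\text{permuted} \geq \text{observed}\})/(1 + B)$ are all assumptions made for illustration, and the paper's exact $p$-value estimate may differ.

```python
import numpy as np


def gaussian_gram(x, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel (assumed kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth**2))


def hsic_statistic(K, L):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return np.trace(K @ H @ L @ H) / n**2


def hsic_permutation_test(x, y, num_permutations=199, bandwidth=1.0, seed=0):
    """Return (observed statistic, permutation p-value) for samples x and y."""
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x, bandwidth)
    L = gaussian_gram(y, bandwidth)
    observed = hsic_statistic(K, L)
    count = 0
    for _ in range(num_permutations):
        # Permuting the sample order of y is the same as permuting rows and columns of L.
        perm = rng.permutation(L.shape[0])
        if hsic_statistic(K, L[np.ix_(perm, perm)]) >= observed:
            count += 1
    # Assumed p-value estimate: the observed statistic is counted among the permuted ones.
    p_value = (1 + count) / (1 + num_permutations)
    return observed, p_value
```

With this estimate the smallest attainable $p$-value is $1/(1 + B)$, so the number of permutations $B$ must be large enough for the chosen significance level to be reachable; for example, rejecting at level 0.05 requires at least $B = 19$.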