Consistency of permutation tests for HSIC and dHSIC

Paper title

Consistency of permutation tests for HSIC and dHSIC

Authors

David Rindt, Dino Sejdinovic, David Steinsaltz

Abstract

The Hilbert-Schmidt Independence Criterion (HSIC) is a popular measure of the dependence between two random variables. The statistic dHSIC is an extension of HSIC that can be used to test joint independence of $d$ random variables. Such hypothesis tests for (joint) independence are often carried out as permutation tests, which compare the observed data with randomly permuted datasets. The main contribution of this work is a proof that the power of such independence tests converges to 1 as the sample size tends to infinity. This answers a question posed in (Pfister, 2018). Additionally, this work proves that HSIC and dHSIC permutation tests have the correct type 1 error rate, and it provides guidance on how to select the number of permutations to use in practice. While the correct type 1 error rate was already proved in (Pfister, 2018), we provide a modified proof following (Berrett, 2019), which extends to the case of non-continuous data. The number of permutations to use was studied by, e.g., (Marozzi, 2004), but not in the context of HSIC, with a slightly different estimate of the $p$-value, and for permutations rather than vectors of permutations. While the last two points have limited novelty, we include them to give a complete overview of permutation testing in the context of HSIC and dHSIC.
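To make the procedure described in the abstract concrete, the following is a minimal sketch of a two-variable HSIC permutation test. It is not taken from the paper: the Gaussian kernel, the fixed bandwidth, the biased V-statistic estimate, the $p$-value formula $(1 + \#\{\text{permuted} \geq \text{observed}\})/(1 + B)$, and all function names are assumptions chosen for illustration. The paper's own $p$-value estimate may differ.

```python
import numpy as np


def gaussian_gram(x, bandwidth=1.0):
    """Gaussian-kernel Gram matrix of a sample (illustrative kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq_norms = np.sum(x**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def hsic_statistic(K, L):
    """Biased (V-statistic) HSIC estimate: trace(K H L H) / n^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n**2


def hsic_permutation_test(x, y, num_permutations=199, bandwidth=1.0, seed=0):
    """Return (observed statistic, permutation p-value).

    Hypothetical helper: permutes the second sample only, which is
    enough to break any dependence between x and y.
    """
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x, bandwidth)
    L = gaussian_gram(y, bandwidth)
    observed = hsic_statistic(K, L)
    n = K.shape[0]
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(n)
        # Permuting rows and columns of L corresponds to permuting y.
        if hsic_statistic(K, L[np.ix_(perm, perm)]) >= observed:
            exceed += 1
    # Assumed p-value estimate: counts the observed statistic itself.
    p_value = (1 + exceed) / (1 + num_permutations)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=100)
    y_dependent = x + 0.1 * rng.normal(size=100)
    y_independent = rng.normal(size=100)
    print(hsic_permutation_test(x, y_dependent))
    print(hsic_permutation_test(x, y_independent))
```

Under this assumed estimate, the smallest attainable $p$-value with $B$ permutations is $1/(B+1)$, so a test at level $\alpha$ can only reject if $B \geq 1/\alpha - 1$. This is one reason the number of permutations matters in practice.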
