Paper Title
Limit Distribution for Smooth Total Variation and $\chi^2$-Divergence in High Dimensions
Paper Authors
Paper Abstract
Statistical divergences are ubiquitous in machine learning as tools for measuring discrepancy between probability distributions. As these applications inherently rely on approximating distributions from samples, we consider empirical approximation under two popular $f$-divergences: the total variation (TV) distance and the $\chi^2$-divergence. To circumvent the sensitivity of these divergences to support mismatch, the framework of Gaussian smoothing is adopted. We study the limit distributions of $\sqrt{n}\,\delta_{\mathsf{TV}}(P_n\ast\mathcal{N}_\sigma,P\ast\mathcal{N}_\sigma)$ and $n\chi^2(P_n\ast\mathcal{N}_\sigma\|P\ast\mathcal{N}_\sigma)$, where $P_n$ is the empirical measure based on $n$ independently and identically distributed (i.i.d.) observations from $P$, $\mathcal{N}_\sigma:=\mathcal{N}(0,\sigma^2\mathrm{I}_d)$, and $\ast$ stands for convolution. In arbitrary dimension, the limit distributions are characterized in terms of a Gaussian process on $\mathbb{R}^d$ whose covariance operator depends on $P$ and on the isotropic Gaussian density with parameter $\sigma$. This, in turn, implies optimality of the $n^{-1/2}$ expected-value convergence rates recently derived for $\delta_{\mathsf{TV}}(P_n\ast\mathcal{N}_\sigma,P\ast\mathcal{N}_\sigma)$ and $\chi^2(P_n\ast\mathcal{N}_\sigma\|P\ast\mathcal{N}_\sigma)$. These strong statistical guarantees promote empirical approximation under Gaussian smoothing as a potent framework for learning and inference based on high-dimensional data.
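For intuition, here is a heuristic sketch of how such a limit arises; it is consistent with the abstract's description but is not the paper's precise statement. Let $\varphi_\sigma$ denote the density of $\mathcal{N}_\sigma$, so the density of $P_n\ast\mathcal{N}_\sigma$ at $x$ is the i.i.d. average $\frac{1}{n}\sum_{i=1}^n\varphi_\sigma(x-X_i)$. A central limit theorem in $L^1(\mathbb{R}^d)$ then suggests

$$\sqrt{n}\,\delta_{\mathsf{TV}}(P_n\ast\mathcal{N}_\sigma,P\ast\mathcal{N}_\sigma)=\frac{1}{2}\int_{\mathbb{R}^d}\bigg|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Big(\varphi_\sigma(x-X_i)-\mathbb{E}\big[\varphi_\sigma(x-X)\big]\Big)\bigg|\,\mathrm{d}x\;\xrightarrow{d}\;\frac{1}{2}\int_{\mathbb{R}^d}|G(x)|\,\mathrm{d}x,$$

where $G$ is a centered Gaussian process on $\mathbb{R}^d$ with covariance $\Sigma(x,y)=\mathrm{Cov}_{X\sim P}\big(\varphi_\sigma(x-X),\varphi_\sigma(y-X)\big)$. The same heuristic suggests $n\chi^2(P_n\ast\mathcal{N}_\sigma\|P\ast\mathcal{N}_\sigma)\xrightarrow{d}\int_{\mathbb{R}^d}G(x)^2/q(x)\,\mathrm{d}x$, with $q$ the density of $P\ast\mathcal{N}_\sigma$.

The $\sqrt{n}$ and $n$ scalings can also be checked numerically. The sketch below is a minimal illustration under assumptions of our choosing (not from the paper): $d=1$, $P=\mathcal{N}(0,1)$, $\sigma=1$, and trapezoidal quadrature on a truncated grid. The smoothed empirical density is an explicit Gaussian mixture, and $P\ast\mathcal{N}_\sigma=\mathcal{N}(0,1+\sigma^2)$ in closed form.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import trapezoid

# Minimal sketch (our assumptions, not the paper's code): d = 1, P = N(0,1),
# smoothing level sigma = 1. Checks that sqrt(n)*TV and n*chi^2 stabilize
# as n grows, matching the scalings studied in the abstract.
rng = np.random.default_rng(0)
sigma = 1.0
xs = np.linspace(-10.0, 10.0, 2001)  # quadrature grid; tails are negligible

def smoothed_divergences(n):
    sample = rng.standard_normal(n)  # i.i.d. observations from P = N(0,1)
    # Density of P_n * N_sigma: uniform mixture of Gaussians at the samples.
    p_n = norm.pdf(xs[:, None], loc=sample, scale=sigma).mean(axis=1)
    # Density of P * N_sigma: for P = N(0,1) this is N(0, 1 + sigma^2).
    q = norm.pdf(xs, scale=np.sqrt(1.0 + sigma**2))
    tv = 0.5 * trapezoid(np.abs(p_n - q), xs)   # delta_TV(P_n*N, P*N)
    chi2 = trapezoid((p_n - q) ** 2 / q, xs)    # chi^2(P_n*N || P*N)
    return tv, chi2

for n in [100, 400, 1600]:
    tv, chi2 = smoothed_divergences(n)
    print(f"n={n:5d}  sqrt(n)*TV = {np.sqrt(n)*tv:.3f}  n*chi2 = {n*chi2:.3f}")
```

One sanity check on the $n\chi^2$ scaling: $\mathbb{E}\big[n\chi^2(P_n\ast\mathcal{N}_\sigma\|P\ast\mathcal{N}_\sigma)\big]=\int\mathrm{Var}_{X\sim P}\big(\varphi_\sigma(x-X)\big)/q(x)\,\mathrm{d}x$ for every $n$, so the printed $n\cdot\chi^2$ values should hover around a constant, while $\sqrt{n}\cdot\mathrm{TV}$ fluctuates around the mean of its limit.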