Paper Title
Minimax Risk and Uniform Convergence Rates for Nonparametric Dyadic Regression
Authors
Abstract
Let $i=1,\ldots,N$ index a simple random sample of units drawn from some large population. For each unit we observe the vector of regressors $X_{i}$ and, for each of the $N\left(N-1\right)$ ordered pairs of units, an outcome $Y_{ij}$. The outcomes $Y_{ij}$ and $Y_{kl}$ are independent if their indices are disjoint, but dependent otherwise (i.e., "dyadically dependent"). Let $W_{ij}=\left(X_{i}',X_{j}'\right)'$; using the sampled data we seek to construct a nonparametric estimate of the mean regression function $g\left(W_{ij}\right)\overset{\mathrm{def}}{\equiv}\mathbb{E}\left[\left.Y_{ij}\right|X_{i},X_{j}\right]$. We present two sets of results. First, we calculate lower bounds on the minimax risk for estimating the regression function (i) at a point and (ii) under the infinity norm. Second, we calculate (i) pointwise and (ii) uniform convergence rates for the dyadic analog of the familiar Nadaraya-Watson (NW) kernel regression estimator. We show that the NW kernel regression estimator achieves the optimal rates suggested by our risk bounds when an appropriate bandwidth sequence is chosen. These optimal rates differ from those available under i.i.d. data: the effective sample size is smaller and $d_W=\mathrm{dim}(W_{ij})$ influences the rate differently.
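To make the estimator concrete, the following is a minimal sketch of a dyadic Nadaraya-Watson estimate: a kernel-weighted average of the outcomes $Y_{ij}$ over all ordered pairs $i \neq j$, with weights that depend on the distance between $W_{ij}$ and the evaluation point $w$. The abstract does not specify a kernel, a bandwidth, or a data-generating process, so the Gaussian product kernel, the fixed bandwidth `h`, the function name `dyadic_nw`, and the simulated design below are all illustrative assumptions rather than the paper's choices.

```python
import numpy as np


def dyadic_nw(X, Y, w, h):
    """Kernel-weighted average of Y_ij over ordered pairs i != j at the point w.

    X : (N, d) array of unit-level regressors X_i
    Y : (N, N) array of dyadic outcomes Y_ij; the diagonal receives zero
        weight, but its entries must be finite
    w : (2d,) evaluation point, stacked as (x_i, x_j)
    h : scalar bandwidth (an assumed fixed value, not an optimal sequence)
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(w, dtype=float)
    N, d = X.shape

    # Scaled distances of every unit from the "sender" and "receiver" halves of w.
    u_i = (X - w[:d]) / h
    u_j = (X - w[d:]) / h

    # Assumed Gaussian product kernel: K(W_ij - w) factorises into a term in
    # X_i and a term in X_j. Normalising constants cancel in the ratio below.
    k_i = np.exp(-0.5 * np.sum(u_i**2, axis=1))
    k_j = np.exp(-0.5 * np.sum(u_j**2, axis=1))
    K = np.outer(k_i, k_j)      # K[i, j] is the weight on the pair (i, j)
    np.fill_diagonal(K, 0.0)    # exclude i == j, leaving N(N-1) ordered pairs

    return np.sum(K * Y) / np.sum(K)


# Hypothetical simulated design with dyadic dependence: Y_ij and Y_kl share a
# unit-level shock whenever their index sets overlap.
rng = np.random.default_rng(0)
N = 300
X = rng.uniform(size=(N, 1))
A = rng.normal(scale=0.3, size=N)        # unit-level shocks shared across dyads
V = rng.normal(scale=0.3, size=(N, N))   # idiosyncratic dyad-level noise
Y = X @ X.T + A[:, None] + A[None, :] + V  # true g(x_i, x_j) = x_i * x_j

g_hat = dyadic_nw(X, Y, w=np.array([0.5, 0.5]), h=0.1)
print(f"estimate at (0.5, 0.5): {g_hat:.3f} (true value 0.25)")
```

In this simulated design the shared shocks `A` mean that, although there are $N(N-1)$ outcomes, only $N$ units contribute independent variation, which is one way to see why the effective sample size is smaller than under i.i.d. sampling.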