Paper Title

Truncated Linear Regression in High Dimensions

Paper Authors

Constantinos Daskalakis, Dhruv Rohatgi, Manolis Zampetakis

Paper Abstract

As in standard linear regression, in truncated linear regression we are given access to observations $(A_i, y_i)_i$ whose dependent variable equals $y_i = A_i^{\rm T} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest and $\eta_i$ is independent noise; except that we are only given an observation if its dependent variable $y_i$ lies in some "truncation set" $S \subset \mathbb{R}$. The goal is to recover $x^*$ under some favorable conditions on the $A_i$'s and the noise distribution. We prove that there exists a computationally and statistically efficient method for recovering $k$-sparse $n$-dimensional vectors $x^*$ from $m$ truncated samples, which attains an optimal $\ell_2$ reconstruction error of $O(\sqrt{(k \log n)/m})$. As a corollary, our guarantees imply a computationally efficient and information-theoretically optimal algorithm for compressed sensing with truncation, which may arise from measurement saturation effects. Our result follows from a statistical and computational analysis of the Stochastic Gradient Descent (SGD) algorithm for solving a natural adaptation of the LASSO optimization problem that accommodates truncation. This generalizes the works of both (1) [Daskalakis et al. 2018], where no regularization is needed due to the low dimensionality of the data, and (2) [Wainwright 2009], where the objective function is simple due to the absence of truncation. To handle truncation and high dimensionality at the same time, we develop new techniques that not only generalize the existing ones but, we believe, are of independent interest.
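To make the setup concrete, below is a minimal Python sketch of SGD on a truncation-aware LASSO objective, for the special case of standard Gaussian noise and a one-sided truncation set $S = [c, +\infty)$, where the truncated log-likelihood and its gradient have closed forms via the inverse Mills ratio. This is an illustration, not the authors' exact algorithm: the rejection-sampling data generator, the proximal soft-thresholding step for the $\ell_1$ penalty, and all hyperparameters (`lam`, `step`, `epochs`) are assumptions chosen for the demo.

```python
# A minimal sketch (NOT the paper's exact procedure) of SGD on an l1-regularized
# negative log-likelihood for truncated linear regression with Gaussian noise
# and a one-sided truncation set S = [c, +inf).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Problem setup: a k-sparse n-dimensional vector x* (sizes are illustrative).
n, k, c = 200, 5, 0.0
x_star = np.zeros(n)
x_star[rng.choice(n, k, replace=False)] = rng.normal(size=k)

def draw_truncated_samples(m):
    """Rejection-sample m pairs (a_i, y_i) with y_i = a_i^T x* + eta_i,
    keeping a sample only when y_i lies in the truncation set S = [c, inf)."""
    A, Y = [], []
    while len(Y) < m:
        a = rng.normal(size=n)
        y = a @ x_star + rng.normal()
        if y >= c:  # truncation: samples outside S are never observed
            A.append(a)
            Y.append(y)
    return np.array(A), np.array(Y)

A, y = draw_truncated_samples(2000)

def sgd_truncated_lasso(A, y, lam=0.05, step=0.01, epochs=30):
    """SGD on the per-sample negative log-likelihood of a N(a^T x, 1) variable
    truncated to [c, inf), with the l1 penalty handled by a proximal
    (soft-thresholding) step.

    NLL(theta) = (y - theta)^2 / 2 + log Pr[N(theta, 1) >= c],  theta = a^T x,
    whose derivative in theta is -(y - theta) + phi(c - theta) / sf(c - theta):
    the usual residual corrected by an inverse Mills ratio."""
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(epochs):
        for i in rng.permutation(m):
            theta = A[i] @ x
            mills = norm.pdf(c - theta) / max(norm.sf(c - theta), 1e-12)
            grad = (-(y[i] - theta) + mills) * A[i]
            x = x - step * grad
            # Proximal step for the l1 penalty: soft-thresholding.
            x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)
    return x

x_hat = sgd_truncated_lasso(A, y)
print("relative l2 error:", np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star))
```

Dropping the Mills-ratio term recovers plain LASSO-style SGD, which is biased under truncation; that correction term is what adapts the objective to the truncated likelihood.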
